
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

NGUYEN THI THANH NHAN

INTERACTIVE AND MULTI-ORGAN
BASED PLANT SPECIES
IDENTIFICATION

Major: Computer Science
Code: 9480101


SUPERVISORS:
1. Assoc. Prof. Dr. Le Thi Lan
2. Assoc. Prof. Dr. Hoang Van Sam

Hanoi − 2020


HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

Nguyen Thi Thanh Nhan

INTERACTIVE AND MULTI-ORGAN
BASED PLANT SPECIES
IDENTIFICATION

Major: Computer Science
Code: 9480101



DOCTORAL DISSERTATION OF
COMPUTER SCIENCE

SUPERVISORS:
1. Assoc. Prof. Dr. Le Thi Lan
2. Assoc. Prof. Dr. Hoang Van Sam

Hanoi − 2020


DECLARATION OF AUTHORSHIP
I, Nguyen Thi Thanh Nhan, declare that this dissertation entitled "Interactive
and multi-organ based plant species identification" and the work presented in it are
my own.
I confirm that:
This work was done wholly or mainly while in candidature for a Ph.D. research
degree at Hanoi University of Science and Technology.
Where any part of this dissertation has previously been submitted for a degree
or any other qualification at Hanoi University of Science and Technology or any
other institution, this has been clearly stated.
Where I have consulted the published work of others, this is always clearly attributed.
Where I have quoted from the work of others, the source is always given. With
the exception of such quotations, this dissertation is entirely my own work.
I have acknowledged all main sources of help.
Where the dissertation is based on work done by myself jointly with others, I
have made clear exactly what was done by others and what I have contributed myself.

Hanoi, January, 2020
PhD Student


Nguyen Thi Thanh Nhan

SUPERVISORS



ACKNOWLEDGEMENT
First of all, I would like to thank my supervisors, Assoc. Prof. Dr. Le Thi Lan
at the International Research Institute MICA, Hanoi University of Science and
Technology, and Assoc. Prof. Dr. Hoang Van Sam at the Vietnam National University of
Forestry, for their inspiration, guidance, and advice. Their guidance helped me
throughout the research and the writing of this dissertation.
Besides my supervisors, I would like to thank Dr. Vu Hai and Assoc. Prof. Dr. Tran
Thi Thanh Hai for their great discussions. Special thanks to my friends and colleagues at
MICA, Hanoi University of Science and Technology: Hoang Van Nam, Nguyen Hong
Quan, Nguyen Van Toi, Duong Nam Duong, Le Van Tuan, Nguyen Huy Hoang, and Do
Thanh Binh, for their technical support. They assisted me greatly in my research
and are co-authors of the published papers.
Moreover, I would like to thank the reviewers of the scientific conferences and
journals, and the members of the defense council, for their many useful comments.
I would like to express my sincere gratitude to the Management Board of the MICA
Institute. I would also like to thank Thai Nguyen University of Information and
Communication Technology for its support over the years, both in my professional work
and beyond.
As a Ph.D. student of the 911 program, I would like to thank this program for its
financial support. I also gratefully acknowledge the financial support for attending
conferences from the Collaborative Research Program for Common Regional Issues (CRC)
funded by the ASEAN University Network (AUN/SEED-Net), under grant reference
HUST/CRC/1501, and from NAFOSTED (grant number 106.06-2018.23).
Special thanks to my family and to my parents-in-law, who took care of my family and
created favorable conditions for me to study. I also would like to thank my beloved
husband and children for their support and encouragement throughout my long years
of study.
Hanoi, January, 2020
Ph.D. Student
Nguyen Thi Thanh Nhan



CONTENTS

DECLARATION OF AUTHORSHIP i
ACKNOWLEDGEMENT ii
CONTENTS v
ABBREVIATIONS vi
MATH SYMBOLS viii
LIST OF TABLES x
LIST OF FIGURES xiv
INTRODUCTION 1

1 LITERATURE REVIEW 10
1.1 Plant identification 10
1.1.1 Manual plant identification 10
1.1.2 Plant identification based on semi-automatic graphic tool 12
1.1.3 Automated plant identification 12
1.2 Automatic plant identification from images of single organ 13
1.2.1 Introducing the plant organs 13
1.2.2 General model of image-based plant identification 16
1.2.3 Preprocessing techniques for images of plant 18
1.2.4 Feature extraction 20
1.2.4.1 Hand-designed features 20
1.2.4.2 Deeply-learned features 22
1.2.5 Classification methods 26
1.3 Plant identification from images of multiple organs 28
1.3.1 Early fusion techniques for plant identification from images of multiple organs 30
1.3.2 Late fusion techniques for plant identification from images of multiple organs 31
1.4 Plant identification studies in Vietnam 33
1.5 Plant data collection and identification systems 35
1.6 Conclusions 44

2 LEAF-BASED PLANT IDENTIFICATION METHOD BASED ON KERNEL DESCRIPTOR 45
2.1 The framework of leaf-based plant identification method 45
2.2 Interactive segmentation 46
2.3 Feature extraction 50
2.3.1 Pixel-level features extraction 50
2.3.2 Patch-level features extraction 51
2.3.2.1 Generate a set of patches from an image with adaptive size 51
2.3.2.2 Compute patch-level feature 52
2.3.3 Image-level features extraction 55
2.4 Experimental results 56
2.4.1 Datasets 56
2.4.1.1 ImageCLEF 2013 dataset 56
2.4.1.2 Flavia dataset 57
2.4.1.3 LifeCLEF 2015 dataset 57
2.4.2 Experimental results 58
2.4.2.1 Results on ImageCLEF 2013 dataset 58
2.4.2.2 Results on Flavia dataset 60
2.4.2.3 Results on LifeCLEF 2015 dataset 61
2.5 Conclusions 66

3 FUSION SCHEMES FOR MULTI-ORGAN BASED PLANT IDENTIFICATION 67
3.1 Introduction 67
3.2 The proposed fusion scheme RHF 69
3.3 The choice of classification model for single organ plant identification 75
3.4 Experimental results 77
3.4.1 Dataset 78
3.4.2 Single organ plant identification results 79
3.4.3 Evaluation of the proposed fusion scheme in multi-organ plant identification 79
3.5 Conclusion 87

4 A FRAMEWORK FOR AUTOMATIC PLANT IDENTIFICATION WITHOUT DEDICATED DATASET AND A CASE STUDY FOR BUILDING IMAGE-BASED PLANT RETRIEVAL 88
4.1 Introduction 88
4.2 Challenges of building automatic plant identification systems 88
4.3 The framework for building automatic plant identification system without dedicated dataset 92
4.4 Plant organ detection 93
4.5 Case study: Development of image-based plant retrieval in VnMed application 99
4.6 Conclusions 104

CONCLUSIONS AND FUTURE WORKS 105
4.6.1 Short term 106
4.6.2 Long term 106

Bibliography 108
PUBLICATIONS 121
APPENDIX 122


ABBREVIATIONS

No. Abbreviation Meaning
1 AB AdaBoost
2 ANN Artificial Neural Network
3 Br Branch
4 CBF Classification Based Fusion
5 CNN Convolutional Neural Network
6 CNNs Convolutional Neural Networks
7 CPU Central Processing Unit
8 CMC Cumulative Match Characteristic Curve
9 DT Decision Tree
10 En Entire
11 FC Fully Connected
12 Fl Flower
13 FN False Negative
14 FP False Positive
15 GPU Graphics Processing Unit
16 GUI Graphical User Interface
17 HOG Histogram of Oriented Gradients
18 ILSVRC ImageNet Large Scale Visual Recognition Challenge
19 KDES Kernel DEScriptors
20 KNN K Nearest Neighbors
21 Le Leaf
22 L-SVM Linear Support Vector Machine
23 MCDCNN Multi-Column Deep Convolutional Neural Networks
24 NB Naive Bayes
25 NNB Nearest NeighBor
26 OPENCV OPEN source Computer Vision library
27 PC Personal Computer
28 PCA Principal Component Analysis
29 PNN Probabilistic Neural Network
30 QDA Quadratic Discriminant Analysis
31 RAM Random Access Memory
32 ReLU Rectified Linear Unit
33 RHF Robust Hybrid Fusion
34 RF Random Forest
35 ROI Region Of Interest
36 SIFT Scale-Invariant Feature Transform
37 SM SoftMax
38 SURF Speeded Up Robust Features
39 SVM Support Vector Machine
40 SVM-RBF Support Vector Machine with Radial Basis Function kernel
41 TP True Positive
42 TN True Negative


MATH SYMBOLS

No. Symbol Meaning
1 Σ Summation: sum of all values in a range of a series
2 R Set of real numbers
3 R^d Set of d-dimensional real vectors
4 π π = 3.141592654...
5 w The L2 normalization of vector w
6 x_i The i-th element of vector x
7 sign(x) The sign function: equals 1 if x ≥ 0, −1 if x < 0
8 ∈ Is a member of
9 max The function that takes the largest value from a list
10 arctan(x) The angle whose tangent is x
11 cos(θ) The cosine of angle θ
12 sin(θ) The sine of angle θ
13 m(z) The magnitude of the gradient vector at pixel z
14 θ(z) The orientation of the gradient vector at pixel z
15 θ̃(z) The normalized gradient vector
16 exp(x) e^x
17 argmax(x) The argument at which the maximum value is attained
18 ⊗ The Kronecker product
19 x^T Transposition of vector x
20 Π Product of all values in a range of a series
21 q The query-image set
22 s_i(I_k) The confidence score of the i-th plant species when using image I_k of a single organ as a query
23 c The predicted species class for the query q
24 C The number of species in the dataset
25 k_m̃ The gradient magnitude kernel
26 k_o The orientation kernel
27 k_p The position kernel
28 m̃(z) The normalized gradient magnitude


LIST OF TABLES

Table 1.1 Example dichotomous key for leaves 11
Table 1.2 Methods of plant identification based on hand-designed features 21
Table 1.3 A summary of available crowdsourcing systems for plant information collection 37
Table 1.4 The highest results of the contest obtained with the same recognition approach using hand-crafted features 41
Table 2.1 Leaf/leafscan dataset of LifeCLEF 2015 57
Table 2.2 Accuracy obtained in six experiments with the ImageCLEF 2013 dataset 59
Table 2.3 Precision, Recall and F-measure of improved KDES with interactive segmentation on the ImageCLEF 2013 dataset 63
Table 2.4 Comparison of the proposed method with the state-of-the-art hand-designed features-based methods on the Flavia dataset 63
Table 2.5 Precision, Recall and F-measure of the proposed method on the Flavia dataset 64
Table 3.1 An example of test phase results and the retrieved plant list determination using the proposed approach 72
Table 3.2 The collected dataset of 50 species with four organs 78
Table 3.3 Single organ plant identification accuracies with two schemes: (1) a CNN for each organ; (2) a CNN for all organs. The best result is in bold 80
Table 3.4 Obtained accuracy at rank-1 and rank-5 when combining each pair of organs with different fusion schemes using AlexNet. The best result for each pair of organs is in bold 81
Table 3.5 Obtained accuracy at rank-1 and rank-5 when combining each pair of organs with different fusion schemes using ResNet. The best result is in bold 81
Table 3.6 Obtained accuracy at rank-1 and rank-5 when combining each pair of organs with different fusion schemes using GoogLeNet. The best result is in bold 82
Table 3.7 Comparison of the proposed fusion schemes with the state-of-the-art method named MCDCNN [76]. The best result is in bold 83
Table 3.8 Rank number (k) where a 99% accuracy rate is achieved using AlexNet. The best result is in bold 84
Table 3.9 Rank number (k) where a 99% accuracy rate is achieved using ResNet. The best result is in bold 86
Table 4.1 Plant images dataset built using the conventional approach 89
Table 4.2 Plant images dataset built by crowdsourcing data collection tools 91
Table 4.3 Dataset used for evaluating the organ detection method 94
Table 4.4 The organ detection performance of GoogLeNet with different weight initializations 95
Table 4.5 Confusion matrix for plant organ detection (%) 96
Table 4.6 Precision, Recall and F-measure for organ detection on the LifeCLEF 2015 dataset 97
Table 4.7 Confusion matrix for detection of 6 organs of 100 Vietnamese species on VnDataset2 (%) 101
Table 4.8 Four Vietnamese medicinal species databases 102
Table 4.9 Results for Vietnamese medicinal plant identification 102


LIST OF FIGURES

Figure 1 Automatic plant identification 2
Figure 2 Examples of these terminologies used in the thesis 3
Figure 3 One observation of a plant 4
Figure 4 (a) Example of large inter-class similarity: leaves of two distinct species are very similar; (b) example of large intra-class variation: leaves of the same species vary significantly due to the growth stage 5
Figure 5 Challenges of plant identification. (a) Viewpoint variation; (b) Occlusion; (c) Clutter; (d) Lighting variation; (e) Color variation of the same species 5
Figure 6 Confusion matrix for two-class classification 6
Figure 7 A general framework of plant identification 8
Figure 1.1 Botany students identifying plants using the manual approach 11
Figure 1.2 (a) Main graphical interface of IDAO; (b), (c), (d) Graphical icons for describing characteristics of leaf, fruit and flower respectively [13] 12
Figure 1.3 Snapshots of Leafsnap (left) and Pl@ntNet (right) applications 13
Figure 1.4 Some types of leaves: (a, b) leaves on simple and complex backgrounds of Acer pseudoplatanus L.; (c) a single leaf of Cercis siliquastrum L.; (d) a compound leaf of Sorbus aucuparia L. 14
Figure 1.5 Illustration of flower inflorescence types (structure of the flower(s) on the plant, how they are connected between them and within the plant) [11] 15
Figure 1.6 The visual diversity of the stem of Crataegus monogyna Jacq. 16
Figure 1.7 Some example branch images 16
Figure 1.8 The entire views of Acer pseudoplatanus L. 17
Figure 1.9 Fundamental steps for image-based plant species identification 17
Figure 1.10 Accuracy of plant identification based on leaf images with complex background in ImageCLEF 2012 [18] 19
Figure 1.11 Feature visualization of a convolutional net trained on ImageNet, from [58] 23
Figure 1.12 Architecture of a Convolutional Neural Network 23
Figure 1.13 Hyperplane separating data samples into 2 classes 27
Figure 1.14 Two fusion approaches: (a) early fusion, (b) late fusion 29
Figure 1.15 Early fusion method in [74] 31
Figure 1.16 Different types of fusion strategies [75] 31
Figure 1.17 Some snapshot images of Pl@ntNet (source ...) 38
Figure 1.18 Obtained results on three flower datasets. The identification rate reduces when the number of species increases 42
Figure 1.19 Comparing the performances on datasets consisting of 50 species. Blue bar: performance on the original dataset collected from LifeCLEF; red bar: performance on the enriched datasets. The species in the two datasets are identical 43
Figure 2.1 The complex background leaf image plant identification framework 46
Figure 2.2 The interactive segmentation scheme 47
Figure 2.3 Standardizing the direction of the leaf. (a) Leaf image after segmentation; (b) conversion to binary image; (c) leaf boundary defined using the Canny filter; (d) standardized image direction 49
Figure 2.4 Examples of leafscan and leaf; the first row shows raw images, the second row shows images after applying the corresponding pre-processing techniques 50
Figure 2.5 An example of the uniform patch in the original KDES and the adaptive patch in our method. (a, b) Two images of the same leaf with different sizes divided using uniform patches; (c, d) two images of the same leaf with different sizes divided using adaptive patches 52
Figure 2.6 An example of patches and cells in an image and how to convert adaptive cells 53
Figure 2.7 Construction of the image-level feature by concatenating feature vectors of cells in layers of the pyramid structure 56
Figure 2.8 32 images of 32 species of the Flavia dataset 57
Figure 2.9 Interactive segmentation developed for mobile devices. Top left: original image; top right: markers; bottom right: boundary with Watershed; bottom left: segmented leaf 58
Figure 2.10 Some imprecise results of image segmentation 59
Figure 2.11 Detailed accuracies obtained on the ImageCLEF 2013 dataset in my experiments. For some classes such as Mespilus germanica, the obtained accuracy in the 4 experiments is 0% 62
Figure 2.12 Detailed scores obtained for Leaf Scan [1]; my team's name is Mica 64
Figure 2.13 Detailed scores obtained for all organs [1]; my team's name is Mica 65
Figure 3.1 An example of two plant species that are similar in leaf but different in flower (left), and two that are similar in leaf but different in fruit (right) 68
Figure 3.2 The framework for multi-organ plant identification 68
Figure 3.3 Explanation of positive and negative samples 71
Figure 3.4 Illustration of the positive and negative sample definition. For a pair of images of leaf (a) and flower (c) of species #326, the corresponding confidence scores of all species in the dataset (e.g., 50) when using the leaf and flower images are shown in (b) 71
Figure 3.5 In the RHF method, each species has an SVM model based on its positive and negative samples 73
Figure 3.6 The process of computing the corresponding positive probabilities for a query using the RHF method 73
Figure 3.7 AlexNet architecture, taken from [46] 75
Figure 3.8 ResNet50 architecture, taken from [139] 76
Figure 3.9 A schematic view of the GoogLeNet architecture [60] 77
Figure 3.10 Single organ plant identification 77
Figure 3.11 Comparison of identification results using leaf, flower, and both leaf and flower images. The first column shows the query images; the second column shows the top 5 species returned by the classifier; the third column shows the corresponding confidence score for each species. The species name in the groundtruth is Robinia pseudoacacia L. 82
Figure 3.12 Cumulative Match Characteristic curve obtained by the proposed method with AlexNet (Scheme 1 for single organ identification) 84
Figure 3.13 Cumulative Match Characteristic curve obtained by the proposed method with ResNet (Scheme 1 for single organ identification) 85
Figure 3.14 Cumulative Match Characteristic curve obtained by the proposed method with AlexNet (Scheme 2 for single organ identification) 85
Figure 3.15 Cumulative Match Characteristic curve obtained by the proposed method with ResNet (Scheme 2 for single organ identification) 86
Figure 4.1 Some challenges in plant and non-plant classification 90
Figure 4.2 Illustration of difficult cases for plant organ detection 91
Figure 4.3 The proposed framework for building an automatic plant identification system without a dedicated dataset 92
Figure 4.4 Some images of data collection for two species: (a) Camellia sinensis, (b) Terminalia catappa. The first row shows images collected by manual image acquisition; the second row shows images collected by crowdsourcing 93
Figure 4.5 Some examples of wrong identification 96
Figure 4.6 Visualization of the predictions of the GoogLeNet used for plant organ detection. Red pixels are evidence for a class, blue ones against it 97
Figure 4.7 Detection results of the GoogLeNet with different classification methods at the first rank (k=1) 98
Figure 4.8 Results obtained by the proposed GoogLeNet and the method in [7] for six organs 99
Figure 4.9 Architecture of the Vietnamese medicinal plant search system [124] 100
Figure 4.10 Snapshots of VnMed: (a) list of species for a group of diseases; (b) detailed information for one species; (c) a query image for plant identification; (d) top five returned results 100
Figure 4.11 Data distribution of 596 Vietnamese medicinal plants 103
Figure 4.12 Illustration of image-based plant retrieval in VnMed 104


INTRODUCTION
Motivation
Plants play an important part in the ecosystem. They provide oxygen, food, fuel,
medicine, and wood, and they help to reduce air pollution and prevent soil erosion.
Good knowledge of flora makes it possible to improve agricultural productivity,
protect biodiversity, balance the ecosystem, and minimize the effects of climate
change. The purpose of plant identification is to match a given plant specimen to a
known taxon; this is considered an important step in assessing flora knowledge.
Traditionally, plant identification is done by botanists using specific botanical
terms. However, this process is complex, time-consuming, and often impossible for
laypeople interested in acquiring knowledge of species. Nowadays, the availability of
relevant technologies (e.g., digital cameras and mobile devices), image datasets, and
advanced techniques in image processing and pattern recognition has made the idea of
automated plant species identification come true. Automatic plant identification can
be defined as the process of determining the species name from observed images of
the plant (see Figure 1). As each species has certain organs and each organ has its
own distinguishing power, current automatic plant identification follows two main
approaches: using images of one sole organ type, or combining images of different
organs.
In recent years, we have witnessed a significant performance improvement in
automatic plant identification, in terms of both accuracy and the number of species
classes [1-4]. According to [4, 5], automatic plant identification results are lower
than those of the best experts, but approximate those of experienced experts and far
exceed those of beginners or amateurs in plant taxonomy. Based on these impressive
results, some applications have been deployed and are widely used, such as
Pl@ntNet [6], Leafsnap [7], and MOSIR [8]. However, the use of plant identification
in practice still faces some limitations. First, the number of covered plant species
(e.g., 10,000 in LifeCLEF [3]) is relatively small compared with the number of plant
species on Earth (e.g., 400,000 [9]). Second, the accuracy of automatic plant
identification still needs to be improved. In our experiments (Section 1.5), we have
shown that when the number of species increases, the identification rate decreases
dramatically due to inter-class similarity.



Figure 1 Automatic plant identification

Objective

The main aim of this thesis is to overcome the second limitation of automatic
plant identification (low recognition accuracy) by proposing novel and robust methods
for plant recognition. To this end, we first focus on improving the recognition
accuracy of plant identification based on images of one sole organ. Among the
different organs of the plant, we select the leaf, as this organ is the most widely
used in the literature [10]. However, according to [10], most images analyzed in
previous studies were taken under simplified conditions (e.g., one mature leaf per
image on a plain background). Towards real-life applications, plant identification
methods should be evaluated on more realistic images (e.g., images with a complex
background, taken under different lighting conditions).
Second, we take into consideration that using one sole organ for plant
identification is not always sufficient, because one organ cannot fully reflect all
the information of a plant due to the large inter-class similarity and the large
intra-class variation. Therefore, multi-organ plant identification is also studied in
this thesis. It is formulated as a late fusion problem: the multi-organ results are
determined from those obtained for each single organ. The thesis therefore focuses
on fusion schemes.
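The late-fusion formulation can be sketched in a few lines. The following is a minimal illustration with hypothetical confidence scores and a simple max rule, not the RHF scheme this thesis proposes:

```python
# Late-fusion sketch: each organ image yields one confidence score per
# species; the per-organ score vectors are combined into a single ranking.
# The scores and the max rule are illustrative only.

def late_fusion_max(score_vectors):
    """Combine per-organ score vectors with the max rule, species by species."""
    n_species = len(score_vectors[0])
    return [max(scores[i] for scores in score_vectors)
            for i in range(n_species)]

# Hypothetical confidence scores for 4 species, from one leaf image and
# one flower image of the same observation.
leaf_scores = [0.10, 0.60, 0.20, 0.10]
flower_scores = [0.05, 0.15, 0.70, 0.10]

fused = late_fusion_max([leaf_scores, flower_scores])
predicted = max(range(len(fused)), key=fused.__getitem__)
print(fused, predicted)  # fused scores and the index of the predicted species
```

Note how the fused ranking can differ from the ranking given by either organ alone; designing a combination rule that is robust to one weak organ is exactly what a fusion scheme must address.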
Finally, the last objective of the thesis is to build an application for Vietnamese
medicinal plant retrieval based on plant identification. Through this application,
knowledge that previously belonged only to botanists can become accessible to the
community.
To this end, the concrete objectives are:
• Develop a new method for leaf-based plant identification that is able to recognize
the plants of interest even in images with complex backgrounds;
• Propose a fusion scheme for multi-organ plant identification;
• Develop an image-based plant search module in a Vietnamese medicinal plant
retrieval application.


Context, constraints, and challenges
Our work is based on the assumption that query images are available. In real
applications, we require users to provide images of the to-be-identified plant, either
by directly capturing images in the field or by selecting images from existing albums.
Throughout this thesis, we use the following terminologies, defined in the plant
identification task of ImageCLEF [11]. Examples of these terminologies are
illustrated in Figure 2.

Figure 2 Examples of these terminologies used in the thesis
• Image of plant: an image captured from a plant. This image contains at least
one type of organ. In this work, we focus on six main organs: leaf, flower, fruit,
branch, stem, and entire.
• SheetAsBackground leaf images: pictures of leaves in front of a white or
uniformly colored background, produced by a scanner or a camera with a sheet; these
images are also named leafscan. Leafscan images can be divided into Scan (scan of a
single leaf) and Scan-like (photograph of a single leaf in front of a uniform
artificial background).
• NaturalBackground images: photographs captured directly from the plant,
including one among the 6 types of organs. It is worth noting that NaturalBackground
images may contain more than one type of organ.
• Observation of a plant: a set of images captured from a single plant by the
same person on the same day, using the same device and lighting conditions.
Figure 3 shows an observation of a plant which contains five images.


Figure 3 One observation of a plant
Automatic plant identification faces several challenges. The first is the large
inter-class similarity and the large intra-class variation. Figure 4(a) illustrates
large inter-class similarity (leaves of two distinct species are very similar), while
Figure 4(b) shows an example of large intra-class variation (leaves of the same
species vary significantly due to the growth stage). The second challenge is that the
background of plant images is usually complex, especially for NaturalBackground
images. Data imbalance is the third challenge, as the distribution of plant species
on the planet is diverse. The fourth challenge is the high number of species: to the
best of our knowledge, the biggest image dataset, from LifeCLEF 2017, contains more
than 1.8M images of 10,000 plant species [3]. Finally, plant images are usually
captured by different users with different acquisition protocols; therefore, they
exhibit lighting and viewpoint variations and may contain occlusions, clutter, and
object deformations. These issues are illustrated in Figure 5 for several species.



Figure 4 (a) Example of large inter-class similarity: leaves of two distinct species are
very similar; (b) example of large intra-class variation: leaves of the same species vary
significantly due to the growth stage.

Figure 5 Challenges of plant identification. (a) Viewpoint variation; (b) Occlusion;
(c) Clutter; (d) Lighting variation; (e) Color variation of the same species.



Evaluation metrics
In plant identification, for each query containing one or multiple images of one or
several organs, a list of species sorted according to the confidence score of the
method/system in the suggested species is provided. To evaluate the proposed
methods, in this thesis we employ five main metrics.
❼ Accuracy at rank k

The first metric, denoted Acc_{rank k}, is the accuracy obtained at rank k of the
method/system, defined as follows:

Acc_{rank k} (%) = (T_k / N) × 100%    (1)

where T_k is the total number of correct recognitions within the k first positions of
the plant lists ranked by the confidence score, and N is the number of queries. With
this metric, we obtain one accuracy value for each value of k. The chosen value of k
may vary depending on the evaluation purpose. Besides k = 1, which is used in all
experimental evaluations, other values such as 5 and 10 are commonly used to evaluate
the behavior of the algorithms on the first 5 and first 10 results.
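As a concrete illustration, accuracy at rank k can be computed from ranked prediction lists as sketched below. The species names and the helper `accuracy_at_rank_k` are invented for illustration only; they are not part of any system described in this dissertation.

```python
def accuracy_at_rank_k(ranked_lists, true_labels, k):
    """Accuracy at rank k (Equation 1): percentage of queries whose
    true species appears among the k first positions of the ranked list."""
    assert len(ranked_lists) == len(true_labels)
    t_k = sum(1 for ranked, true in zip(ranked_lists, true_labels)
              if true in ranked[:k])
    return 100.0 * t_k / len(true_labels)

# Three queries; each prediction list is ranked by confidence score.
preds = [["rosa", "malus", "acer"],
         ["acer", "rosa", "malus"],
         ["malus", "acer", "rosa"]]
truth = ["rosa", "rosa", "rosa"]
print(accuracy_at_rank_k(preds, truth, 1))  # ~33.33: 1 of 3 correct at rank 1
print(accuracy_at_rank_k(preds, truth, 2))  # ~66.67: 2 of 3 correct within rank 2
```

Note that the metric is monotonically non-decreasing in k: enlarging the window can only admit more correct recognitions.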
❼ Precision

To analyze the behavior of classification problems, a confusion matrix is usually
provided. Figure 6 illustrates a confusion matrix for a two-class classification
problem, where TP (True Positive) and TN (True Negative) represent the correct
decisions while FP (False Positive) and FN (False Negative) represent the errors.
For a multi-class classification problem, we also build a confusion matrix for each
class by considering that class as positive and combining the remaining classes into
the negative class. From the confusion matrix, we can compute three evaluation
metrics, namely Precision, Recall and F-measure, as follows:

Figure 6 Confusion matrix for two-class classification.

Precision = TP / (TP + FP)    (2)

❼ Recall

Recall = TP / (TP + FN)    (3)

❼ F-measure

F-measure = (2 × Precision × Recall) / (Precision + Recall)    (4)
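A small sketch of how these three metrics follow from the confusion-matrix counts of one positive class. The counts used in the example are illustrative only, not taken from any experiment in this thesis.

```python
def precision_recall_f(tp, fp, fn):
    """Compute Precision, Recall and F-measure (Equations 2-4) from the
    confusion-matrix counts of one positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Example: 80 true positives, 20 false positives, 10 false negatives.
p, r, f = precision_recall_f(80, 20, 10)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.8 0.889 0.842
```

Since F-measure is the harmonic mean of Precision and Recall, it always lies between the two and is pulled toward the smaller value.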


❼ Score at image level

The last metric, denoted S, is the score at image level. This metric is defined
and employed as an evaluation metric for plant identification in the LifeCLEF 2015
competition [1]. The value of S is defined as:

S = (1/U) Σ_{u=1}^{U} (1/P_u) Σ_{p=1}^{P_u} (1/N_{u,p}) Σ_{n=1}^{N_{u,p}} (1/r_{u,p,n})    (5)

where U is the number of users having at least one image in the test set, P_u is the
number of individual plants observed by the u-th user (within the test set), N_{u,p}
is the number of pictures of the p-th plant observation of the u-th user, and r_{u,p,n}
is the rank of the correct species within the ranked list of images returned by the
identification method. This metric compensates for the long-tail distribution
effects occurring in social data (most users provide only a little data, while a few
provide huge quantities). The value of S ranges from 0 to 1; the greater the value
of S, the better the identification method.
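The nested averaging in Equation (5) can be sketched as follows. The per-user rank lists in the example are invented for illustration; the function name is a hypothetical helper, not part of the LifeCLEF evaluation code.

```python
def score_image_level(ranks):
    """Score S (Equation 5). `ranks` maps each user u to a list of
    observations; each observation p is a list of ranks r_{u,p,n} of the
    correct species for that observation's images."""
    total = 0.0
    for observations in ranks.values():       # outer average over U users
        s_u = 0.0
        for image_ranks in observations:      # average over P_u observations
            # average of reciprocal ranks over the N_{u,p} images
            s_u += sum(1.0 / r for r in image_ranks) / len(image_ranks)
        total += s_u / len(observations)
    return total / len(ranks)

# Two users: user1 has one observation with two images (correct species
# ranked 1st and 2nd); user2 has two single-image observations.
ranks = {"user1": [[1, 2]], "user2": [[1], [4]]}
print(score_image_level(ranks))  # 0.6875
```

Because each user contributes equally regardless of how many images they uploaded, prolific contributors cannot dominate the score, which is exactly the long-tail compensation described above.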

Contributions
The dissertation has three main contributions as follows:
❼ Contribution 1: A complex-background leaf-based plant identification method
has been proposed. The proposed method takes advantage of segmentation with a
few interactions from the user to determine the leaf region. Features are then
extracted on this region using the representative power of the Kernel Descriptor
(KDES). Experimental results obtained on different benchmark datasets show that
the proposed method outperforms state-of-the-art hand-crafted feature-based methods.


❼ Contribution 2: A fusion scheme for two-organ based plant identification
has been introduced. The fusion integrates a product rule with a
classification-based approach.
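As a rough illustration of the product-rule half of such a fusion, consider combining the per-species confidence scores of two single-organ classifiers. This is a simplified sketch only, with invented scores; the actual RHF scheme proposed in Chapter 3 also integrates a classification-based approach.

```python
def product_rule_fusion(scores_organ1, scores_organ2):
    """Product-rule fusion sketch: multiply the per-species confidence
    scores of two single-organ classifiers, then renormalize to sum to 1."""
    fused = [s1 * s2 for s1, s2 in zip(scores_organ1, scores_organ2)]
    total = sum(fused)
    return [s / total for s in fused]

leaf = [0.6, 0.3, 0.1]     # hypothetical leaf-classifier confidences over 3 species
flower = [0.2, 0.5, 0.3]   # hypothetical flower-classifier confidences
print(product_rule_fusion(leaf, flower))  # ~[0.4, 0.5, 0.1]
```

Note how the product rule rewards species that both organs agree on: species 2, ranked second by the leaf alone, becomes the top candidate once the flower's evidence is multiplied in.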
❼ Contribution 3: Finally, an image-based plant searching module has been
developed and deployed in the Vietnamese medicinal plant retrieval application
named VnMed.

General framework and dissertation outline
In this dissertation, we propose a unified framework for plant identification. The
proposed framework consists of three main phases, as illustrated in Figure 7. The first
phase is to build a dataset based on crowdsourcing, then to filter the data with the
proposed organ detection method in order to remove non-plant images and to classify
images by organ. The second phase is plant identification at image level. The third
phase is organ combination. By utilizing this framework, we deploy a real application
for Vietnamese medicinal plants. The research works in the dissertation are organized
into six chapters as follows:
❼ Introduction: This section describes the main motivations and objectives of the
study. We also present the key points of the research context, along with the
constraints and challenges addressed in the dissertation. Additionally, the general
framework and main contributions of the dissertation are presented.
❼ Chapter 1: A Literature Review: This chapter mainly surveys existing works and

approaches proposed for automatic plant identification.
❼ Chapter 2: In this chapter, a method for plant identification based on leaf images
is proposed. In the proposed method, to extract the leaf region from images, we apply
interactive segmentation. Then, the improved KDES is employed to extract leaf features.

Figure 7 A general framework of plant identification.



❼ Chapter 3: This chapter focuses on multi-organ plant identification. We propose

a method named RHF (Robust Hybrid Fusion) for determining the result of
two-organ identification based on those of single-organ ones.
❼ Chapter 4: In this chapter, we propose an organ detection method and develop an
application for the Vietnamese medicinal plant retrieval system based on this method.
❼ Conclusion: We give some conclusions and discuss the limitations of the proposed
methods. Research directions are also described for future work.
