A Novel Ant Based Algorithm for Multiple Graph Alignment

Trần Ngọc Hà
Thai Nguyen University of Education

Đỗ Đức Đông
Vietnam National University-Hanoi

Hoàng Xuân Huấn
Vietnam National University-Hanoi

Abstract— Multiple graph alignment (MGA) is a new approach to analyzing protein structures in order to explore their functional similarity. In this article, we propose a two-stage memetic algorithm for the MGA problem, named ACO-MGA2, based on the ant colony optimization metaheuristic. A local search procedure is applied only in the second stage of the algorithm to save runtime. Experimental results show that ACO-MGA2 outperforms state-of-the-art algorithms, producing alignments of better quality.

Keywords— Multiple Graph Alignment, Ant Colony Optimization, local search, memetic algorithm, SMMAS pheromone update rule

I. INTRODUCTION

Multiple sequence alignment is a useful approach for analyzing evolutionary homology among DNA sequences or proteins. However, this method is not suitable for determining functional similarities among molecules, because functional similarity relates more closely to structural features than to sequential ones [6,12,15,18,19].

Recently, a number of authors [1, 2, 10-12, 22-24] have proposed using graph models to represent the three-dimensional structures of proteins and using graph alignment techniques to infer functional similarities from structural analysis. These methods mainly use exact pairwise graph matching. They produce meaningful results when studying the functional evolution of non-homologous molecules. However, it is difficult to use these methods to discover biologically meaningful patterns that are only approximately conserved.

Weskamp et al. [21] were the first (2007) to introduce the concept of multiple graph alignment (MGA) and to use it to analyze protein active sites. They proposed a heuristic algorithm based on a greedy strategy. Graphs are used to approximately describe binding pockets in Cavbase [8,14]. In this approach, each binding pocket is modeled as a connected graph G(V, E), and the MGA problem is stated as follows. Given a set G = {G1(V1,E1),…,Gn(Vn,En)} of connected, node-labeled, edge-weighted graphs, three edit operations are allowed on each graph: deletion or insertion of a vertex, change of the label of a node, and change of the weight of an edge. The goal of the MGA problem is to find an alignment of the vertices of the graphs in G that optimizes a predefined objective function.

MGA is an NP-hard problem (see [6, 21]). Heuristic algorithms are only suitable for small problems and hence not for real applications. Fober et al. [6] extended the use of this problem to the structural analysis of biomolecules and proposed an evolutionary algorithm called GAVEO. Experiments show that this algorithm is more effective than the greedy algorithm, although it is more time consuming.
In [20] the authors proposed the ACO-MGA algorithm, which uses a simple ant colony optimization scheme to solve the multiple graph alignment problem. Experiments show that this algorithm gives better results than GAVEO; however, its running time is long and its performance degrades on large data sets.

This paper introduces a two-stage memetic algorithm based on ant colony optimization, called ACO-MGA2, as an improvement of ACO-MGA for aligning multiple graphs. We keep the construction graph of ACO-MGA but improve the heuristic information and the local search procedures. To reduce the running time, the algorithm is split into two stages, and local search is applied only in the second stage of the memetic scheme [13]. The local search consists of two procedures: 1) rearranging differently labeled vertices in the alignment vectors to improve the compatibility of the vertices, and 2) swapping identically labeled vertices within each graph to improve the fit of the edge weights. Improvements in both runtime and solution quality of ACO-MGA2 are demonstrated empirically by comparison with GAVEO and Greedy.

The rest of this paper is organized as follows. Section 2 gives a mathematical statement of the multiple graph alignment problem and summarizes related work. Section 3 introduces the proposed algorithm. Experimental results are presented in Section 4. Conclusions are presented in the last section.
II. MULTIPLE GRAPH ALIGNMENT PROBLEM AND RELATED WORKS

A. Multiple graph alignment problem
The multiple graph alignment problem was introduced by Weskamp et al. [21] for the purpose of studying protein characteristics. Fober et al. [6] extended it to analyze the structure of molecules, including the chemical composition and the protein binding sites. The problem is stated as follows (for more details see [6, 20]).
Definition 1. (Multigraph) A multigraph is a set of graphs G = {G1(V1, E1),…, Gn(Vn, En)}, where each Gi(Vi, Ei) is a connected graph, each vertex carries a label from a given set L, and the edge weights represent the Euclidean distances between the vertices.
Definition 2. (Edit operations) The following edit operations distinguish a graph G(V, E) from another graph:
i) Insertion or deletion of a node: a node v ∈ V and the edges associated with it can be deleted or inserted.
ii) Change of the label of a node: the label l(v) of a node v ∈ V can be replaced by another label in L.
iii) Change of the weight of an edge: the weight w(e) of an edge e can be changed based on the conformation.
Definition 3. (Multiple Graph Alignment) Let G = {G1(V1,E1),…,Gn(Vn,En)} be a multigraph, and add to each vertex set Vi a dummy node (denoted ⊥) that is not connected to the other nodes. Then A ⊆ (V1 ∪ {⊥}) × ... × (Vn ∪ {⊥}) is an alignment of the multigraph G if and only if:
i) For all i = 1,…,n and for each v ∈ Vi, there exists exactly one a = (a1,…,an) ∈ A such that v = ai.
ii) For each a = (a1,…,an) ∈ A, there exists at least one 1 ≤ i ≤ n such that ai ≠ ⊥.
Each a = (a1,…,an) ∈ A is called a column vector of the alignment, and each v ∈ Vi is called a real node.
For ease of reading, we keep the notation G = {G1(V1, E1),…,Gn(Vn, En)} for the multigraph in which a dummy node has been added to each graph Gi.
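The two conditions of Definition 3 can be checked mechanically. The following is a minimal sketch, not part of the original algorithm: the function name `is_valid_alignment`, the representation of a column vector as a tuple, and the use of `None` for the dummy node ⊥ are all assumptions made for illustration.

```python
def is_valid_alignment(columns, vertex_sets):
    """Check conditions i) and ii) of Definition 3.

    columns     -- list of column vectors, each a tuple with one entry per graph;
                   None stands for the dummy node (an assumption of this sketch)
    vertex_sets -- list of sets, vertex_sets[i] holds the real vertices of graph i
    """
    n = len(vertex_sets)
    seen = [set() for _ in range(n)]
    for col in columns:
        if len(col) != n:
            return False
        # Condition ii): a column made only of dummy nodes is not allowed.
        if all(v is None for v in col):
            return False
        for i, v in enumerate(col):
            if v is None:
                continue
            # Condition i), part 1: a real vertex may appear in at most one column.
            if v not in vertex_sets[i] or v in seen[i]:
                return False
            seen[i].add(v)
    # Condition i), part 2: every real vertex appears in some column.
    return all(seen[i] == set(vertex_sets[i]) for i in range(n))
```

For example, with two graphs whose vertex sets are {0, 1} and {0}, the columns (0, 0) and (1, ⊥) form a valid alignment, while dropping the second column does not, because vertex 1 of the first graph would be left unaligned.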
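Definition 4. (Scoring function) The score s of a given alignment A = (a^1,…, a^n) is defined in Equation 1.

$$ s(A) = \sum_{i=1}^{n} ns(a^i) + \sum_{1 \le i < j \le n} es(a^i, a^j) \tag{1} $$

where ns scores the fitness of the corresponding column and is calculated by Equation 2:

$$ ns\begin{pmatrix} a^i_1 \\ \vdots \\ a^i_m \end{pmatrix} = \sum_{1 \le j < k \le m} \begin{cases} ns_m & l(a^i_j) = l(a^i_k) \\ ns_{mm} & l(a^i_j) \ne l(a^i_k) \\ ns_{dummy} & a^i_j = \perp,\ a^i_k \ne \perp \\ ns_{dummy} & a^i_j \ne \perp,\ a^i_k = \perp \end{cases} \tag{2} $$

and es evaluates the compatibility of the edge lengths and is calculated by Equation 3:

$$ es\left( \begin{pmatrix} a^i_1 \\ \vdots \\ a^i_m \end{pmatrix}, \begin{pmatrix} a^j_1 \\ \vdots \\ a^j_m \end{pmatrix} \right) = \sum_{1 \le k < l \le m} \begin{cases} es_{mm} & (a^i_k, a^j_k) \in E_k,\ (a^i_l, a^j_l) \notin E_l \\ es_{mm} & (a^i_k, a^j_k) \notin E_k,\ (a^i_l, a^j_l) \in E_l \\ es_m & d^{ij}_{kl} \le \varepsilon \\ es_{mm} & d^{ij}_{kl} > \varepsilon \end{cases} \tag{3} $$

In Equation 3, $d^{ij}_{kl} = |w(a^i_k, a^j_k) - w(a^i_l, a^j_l)|$ is the difference between the weights of the corresponding edges in the graphs Gk and Gl. Note that in Equations 1 to 3 the index n counts the column vectors of A, while m counts the components of a column, that is, one per graph. The parameters (nsm, nsmm, nsdummy, esm, esmm) are reused from [21]: nsm = 1.0; nsmm = -5.0; nsdummy = -2.5; esm = 0.2; esmm = -0.1.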

A solution of the MGA problem is an alignment that maximizes the scoring function s(A). This is an NP-hard problem (see [6, 21]). If an exhaustive search is used to solve it, the complexity is O((Vmax!)^n), where Vmax is the number of vertices of the graph with the most vertices and n is the number of graphs.
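To make the scoring function of Equations 1 to 3 concrete, the following minimal sketch evaluates it for a small alignment. It is an illustration under stated assumptions, not the authors' implementation: the data layout (`labels`, `weights`), the use of `None` for the dummy node, the default value of `eps`, and the choice to score 0 for the two cases the equations leave open (both entries dummy, or the edge absent in both graphs) are all assumptions.

```python
from itertools import combinations

# Parameter values quoted in the text from [21].
NS_M, NS_MM, NS_DUMMY = 1.0, -5.0, -2.5
ES_M, ES_MM = 0.2, -0.1


def node_score(column, labels):
    """Equation 2: labels[g][v] is the label of vertex v in graph g."""
    total = 0.0
    for j, k in combinations(range(len(column)), 2):
        u, v = column[j], column[k]
        if u is None and v is None:
            continue  # case not covered by Equation 2; scored 0 here (assumption)
        if u is None or v is None:
            total += NS_DUMMY
        elif labels[j][u] == labels[k][v]:
            total += NS_M
        else:
            total += NS_MM
    return total


def edge_weight(weights, g, u, v):
    """Weight of edge {u, v} in graph g, or None if the edge does not exist."""
    if u is None or v is None:
        return None
    return weights[g].get(frozenset((u, v)))


def edge_score(col_i, col_j, weights, eps):
    """Equation 3 for one pair of columns."""
    total = 0.0
    for k, l in combinations(range(len(col_i)), 2):
        wk = edge_weight(weights, k, col_i[k], col_j[k])
        wl = edge_weight(weights, l, col_i[l], col_j[l])
        if wk is None and wl is None:
            continue  # edge absent in both graphs; scored 0 here (assumption)
        if wk is None or wl is None:
            total += ES_MM
        elif abs(wk - wl) <= eps:
            total += ES_M
        else:
            total += ES_MM
    return total


def alignment_score(columns, labels, weights, eps=0.2):
    """Equation 1; eps is a hypothetical tolerance, not a value from the paper."""
    score = sum(node_score(c, labels) for c in columns)
    score += sum(edge_score(ci, cj, weights, eps)
                 for ci, cj in combinations(columns, 2))
    return score
```

For two graphs that each have vertices 0 (label A) and 1 (label B) joined by an edge of weight 1.0 and 1.1 respectively, aligning equal labels gives two node matches and one edge match, while crossing the labels gives two node mismatches and one edge match.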
B. Related works
Weskamp et al. [21] proposed applying the multiple graph alignment problem to the study of protein characteristics, where graphs are used to approximately describe the binding pockets.
Greedy algorithm. Weskamp et al. [21] were the first (2007) to study the MGA problem and to use it in the analysis of protein active sites. They proposed a greedy algorithm that reduces the comparison of multiple graphs to pairwise comparisons in order to find a good enough solution in a small amount of time.
GAVEO algorithm. Fober et al. [6] proposed a genetic algorithm called GAVEO that substantially improves solution quality compared with the greedy algorithm of Weskamp et al., although its runtime is higher.
ACO-MGA algorithm. The authors of [20] proposed an ant colony optimization (ACO) algorithm that uses simple heuristics and a local search technique that changes the positions of vertices with the same label within each component graph to increase the edge fitness term of the objective function. This method yields better results than GAVEO, but its running time is longer when the data size is large.
ACO method. Proposed by Dorigo in 1991 (see [5]), ACO is a stochastic metaheuristic for solving difficult combinatorial optimization problems. In ACO algorithms, the original problem is transformed into the problem of finding a solution on a construction graph G = (V, E, Ω, η, T), where V is the vertex set, E is the edge set, Ω is the set of constraints for building a solution, and η and T are vectors that represent the heuristic information and the reinforcement learning information used to construct a solution. This information may be placed on the edges or on the vertices.
In each iteration, each ant in a colony of m ants builds a solution on the construction graph. It starts from a start vertex and extends a random sequence guided by the reinforcement learning information, represented by the pheromone trail, and by the heuristic information. The sequence follows a random walk that satisfies the constraints Ω. The solution is then evaluated (optionally after applying a local search), and the pheromone trail is updated as reinforcement learning information for the next iteration. The best solution found is returned as the solution of the problem (for more details see [5]).
Memetic algorithm. Memetic algorithms [13] introduce local search techniques into population-based iterative algorithms. The solutions found after each iteration are selectively passed to the local search in a flexible way, which makes the algorithms effective while saving runtime.

To apply the memetic scheme based on the ACO method, four components need to be specified: 1) the construction graph and the procedure for sequentially building a solution under the given constraints, 2) the heuristic information, 3) the pheromone update rule, and 4) the local search techniques and how they are used.
III. THE PROPOSED ALGORITHM

Consider the alignment problem for a set of graphs G = {G1(V1,E1),…,Gn(Vn,En)} in which a dummy node has been added to each graph as in Definitions 3 and 4. Our new algorithm is an ACO-based memetic algorithm named ACO-MGA2. It uses the same construction graph as ACO-MGA but with more effective heuristic information and local search procedures. The general framework of ACO-MGA2 is as follows.
A. General framework
After initializing the parameters and m artificial ants (agents), ACO-MGA2 runs in two stages, as in Algorithm 1.
The first stage (applied for the first 70% of iterations). In each iteration, each ant builds a solution on the construction graph based on heuristic information and pheromone trail intensity. The algorithm then determines the best solution of the iteration, updates the pheromone trail according to the SMMAS rule, and updates the best solution found so far.
The second stage (applied for the last 30% of iterations). In each iteration, after the ants build their solutions, two local search techniques are applied to find the best solution of the iteration. Because vertex label fitness has more effect on the objective function (Equation 1) than edge weight fitness does, the procedure that repositions vertices with different labels on the alignment vectors is applied first. These procedures follow the "best" strategy (that is, searching from the first graph to the last graph to get the best possible solution). ACO-MGA2 then updates the pheromone trail according to the SMMAS rule and updates the best solution.
Algorithm 1: ACO-MGA2 algorithm
Input: A set of graphs G = {G1(V1,E1),…,Gn(Vn,En)}
Output: The best alignment A ⊆ (V1 ∪ {⊥}) × ... × (Vn ∪ {⊥}) for G
Begin
  Initialize; // initialize the pheromone trail matrix and m ants
  while (stop conditions not satisfied) do
    for each ant a do
      Ant a builds a multiple graph alignment;
      Local search: // run only in the second stage
        Search by changing the positions of vertices with different labels;
        Search by changing the positions of vertices with the same label;
    end for
    Update the pheromone trail following the SMMAS rule;
    Update the best solution;
  end while
  Save the best solution;
End

B. Components of ACO-MGA2
Construction Graph
The construction graph consists of n layers, where layer i is the graph Gi in the set G. The vertices of a layer are connected to all of the vertices of the layer below. The vertices of the bottom layer are connected to all of the vertices of the top layer, so the top layer is considered the next layer after the bottom layer. Figure 1 illustrates the construction graph for the case where ants start from the graph G1; the connections with the bottom layer are not displayed. Round nodes are real and square nodes are dummy.

Fig.1. Construction graph for the alignment of n graphs.


An alignment of the graphs (by Definition 3) is a set of paths from G1 through every layer to Gn such that each path passes through exactly one vertex of each layer and each real vertex of the construction graph lies on exactly one path. Dummy nodes may lie on more than one path.
Remark. The paths forming an alignment can be viewed as a single path, as is usual in ACO algorithms. This implied path starts from a vertex of the graph G1 and passes through all subsequent graphs to the last graph. It then "walks" to a vertex of the top layer belonging to another alignment vector, and continues until it has passed through all real nodes, each exactly once.
Pheromone trails and heuristic information

The pheromone trail intensity $\tau^i_{j,k}$ on the edge connecting vertex j of graph Gi with vertex k of the next graph is initialized to τmax and is updated after each iteration.

The heuristic information $\eta^i_{j,k}(a)$ is calculated by Equation 4.

$$ \eta^i_{j,k}(a) = \begin{cases} count(k, a) + 1 & k \text{ is a real node} \\ \dfrac{1}{n \cdot V_{max}} & k \text{ is a dummy node} \end{cases} \tag{4} $$

where count(k,a) is the number of vertices in the partial vector (a1,…,ai) that have the same label as vertex k, if k is a real vertex, and Vmax is the number of vertices of the graph with the most vertices.
Random walk procedure to construct an alignment
In each iteration, each ant repeats the following process to build the vectors a = (a1,…, an) of an alignment A.

The ant randomly chooses a real vertex that is not yet aligned on the construction graph as the starting vertex, and then walks step by step to a vertex on the next graph, choosing it at random based on the heuristic information and the pheromone trail (with the probability given by Equation 5). For ease of presentation, assume that the starting vertex is the vertex a1 of the graph G1 and that the ant has walked along the path <a1,…,ai> to the vertex j = ai of graph Gi. It then chooses vertex k in Gi+1 with probability:

$$ p^i_{j,k} = \frac{(\tau^i_{j,k})^{\alpha} \, [\eta^i_{j,k}(a)]^{\beta}}{\sum_{s \in R\_V_{i+1}} (\tau^i_{j,s})^{\alpha} \, [\eta^i_{j,s}(a)]^{\beta}} \tag{5} $$

where R_Vi denotes the not yet aligned vertices belonging to Vi, including the dummy node.
After a vector a = (a1,…,an) is fully built, its real vertices are removed from the construction graph, and the ant repeats the procedure until all vertices have been aligned.
Note that if the first real node selected does not belong to G1 but to some Gm instead, the above procedure consists of two processes: aligning from Gm to Gn and aligning from G1 to Gm-1.
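One step of this random walk, combining the heuristic of Equation 4 with the selection probability of Equation 5, can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the representation of a candidate as a `(vertex, label)` pair with `(None, None)` for the dummy node, and the `tau` dictionary holding the pheromone on the edges leaving the current vertex are assumptions.

```python
import random


def heuristic(k_label, partial_labels, n_graphs, v_max):
    """Equation 4. k_label is None when the candidate is the dummy node."""
    if k_label is None:
        return 1.0 / (n_graphs * v_max)
    # count(k, a) + 1: vertices already in the vector with the same label, plus one.
    return partial_labels.count(k_label) + 1


def choose_next(candidates, tau, partial_labels, n_graphs, v_max,
                alpha=1.0, beta=1.0, rng=random):
    """Pick the next vertex with the probabilities of Equation 5.

    candidates     -- list of (vertex, label) pairs still unaligned in the next
                      graph, with (None, None) for the dummy node
    tau            -- dict mapping a candidate vertex to the pheromone on the
                      edge from the current vertex to it
    partial_labels -- labels of the real vertices already in the vector
    Returns the chosen vertex and the full probability list.
    """
    weights = [
        tau[v] ** alpha * heuristic(lbl, partial_labels, n_graphs, v_max) ** beta
        for v, lbl in candidates
    ]
    total = sum(weights)
    probs = [w / total for w in weights]
    chosen_vertex, _ = rng.choices(candidates, weights=weights, k=1)[0]
    return chosen_vertex, probs
```

With equal pheromone on all edges, a candidate whose label already appears twice in the partial vector gets heuristic value 3, an unseen label gets 1, and the dummy node gets 1/(n·Vmax), so the walk strongly prefers matching labels and falls back to the dummy node only rarely.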
Pheromone Update Rule
After the ants have found their solutions (in the first stage) or carried out local search (in the second stage), the pheromone trail intensity is updated according to the SMMAS pheromone update rule of [4, 9], as follows:

$$ \tau^i_{j,k} = (1 - \rho)\,\tau^i_{j,k} + \Delta^i_{j,k} \tag{6} $$

$$ \Delta^i_{j,k} = \begin{cases} \rho \cdot \tau_{max} & (i,j,k) \in \text{best solution} \\ \rho \cdot \tau_{min} & (i,j,k) \notin \text{best solution} \end{cases} \tag{7} $$

where τmax and τmin are given parameters, ρ ∈ (0,1) is a parameter, and best solution is the best solution found in the current iteration.
Note that in Equation 6 the parameter ρ balances two behaviors: intensifying the search around the best-found solution and exploring new solutions. A large ρ emphasizes intensification, and a small ρ emphasizes exploration.
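Equations 6 and 7 amount to moving every trail a fraction ρ of the way toward τmax (edges of the iteration-best solution) or toward τmin (all other edges). A minimal sketch, assuming the trails are stored in a dictionary keyed by (i, j, k) triples; the function name and data layout are illustrative only.

```python
def smmas_update(tau, best_edges, rho, tau_max, tau_min):
    """Apply Equations 6 and 7 in place.

    tau        -- dict mapping an edge key (i, j, k) to its pheromone value
    best_edges -- set of edge keys used by the best solution of the iteration
    """
    for edge in tau:
        target = tau_max if edge in best_edges else tau_min
        tau[edge] = (1.0 - rho) * tau[edge] + rho * target
```

Because each new value is a convex combination of the old value and a bound, trails that start inside [τmin, τmax] stay inside that interval without any explicit clamping.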
Local search
The local search procedure is performed sequentially from the graph G1 to the graph Gn and stops when the best result is found. It consists of two techniques: changing the positions of vertices with different labels and changing the positions of vertices with the same label.
1) Swap pairs of vertices with different labels: swap a pair of differently labeled vertices of the considered graph Gi between their alignment vectors if this increases the number of vertices with the same label in the alignment vectors.
2) Swap pairs of vertices with the same label: swap a pair of identically labeled vertices of the considered graph Gi between their alignment vectors if this improves the fitness of the weights of the related edges.
If the score function increases after a swap, the resulting alignment replaces the current best solution. This process is repeated until the best solution is found.
In Equation 1, the fitness of vertex labels has more effect on the objective function than the fitness of edge weights does, so swapping pairs of differently labeled vertices takes priority. Therefore, for each alignment, we swap pairs of identically labeled vertices only after finishing the swaps of differently labeled vertices.
Because the local search procedure is time consuming, it is applied only in the second stage, when the best-found solution is already good enough.
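The two swap procedures share one pattern: exchange the positions of two vertices of the same graph between two alignment vectors and keep the exchange only if the score goes up. The sketch below shows that pattern under several assumptions that are not from the paper: the name `swap_local_search`, alignment vectors stored as mutable lists with `None` for the dummy node, a caller-supplied `score` function, swaps restricted to real vertices, and repetition until no swap improves the score.

```python
def swap_local_search(columns, labels, score, same_label):
    """Swap vertices of one graph between two columns while the score improves.

    columns    -- list of alignment vectors (lists), columns[p][g] is the vertex
                  of graph g in vector p, or None for the dummy node
    labels     -- labels[g][v] is the label of vertex v in graph g
    score      -- callable returning the score of the whole alignment
    same_label -- False: swap only differently labeled pairs (procedure 1)
                  True:  swap only identically labeled pairs (procedure 2)
    Returns the final score; columns is modified in place.
    """
    best = score(columns)
    improved = True
    while improved:
        improved = False
        for g in range(len(columns[0])):          # from the first graph to the last
            for p in range(len(columns)):
                for q in range(p + 1, len(columns)):
                    u, v = columns[p][g], columns[q][g]
                    if u is None or v is None:
                        continue
                    if (labels[g][u] == labels[g][v]) != same_label:
                        continue
                    columns[p][g], columns[q][g] = v, u
                    new_score = score(columns)
                    if new_score > best:
                        best = new_score
                        improved = True
                    else:
                        columns[p][g], columns[q][g] = u, v  # undo the swap
    return best
```

Following the priority described above, a caller would run this once with `same_label=False` and then once with `same_label=True` on the result.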
IV. EXPERIMENTAL RESULTS

Because ACO-MGA2 is an improved version of ACO-MGA, the experiments presented here compare ACO-MGA2 only with the Greedy algorithm [21] and the evolutionary algorithm GAVEO [6], with respect to solution quality and runtime. The experiments are performed as follows:
1) Run the algorithms on the same data sets with a predetermined number of iterations to compare alignment quality and runtime.
2) Run the algorithms on the same data sets with a predetermined runtime to compare alignment quality. The runtime is varied to assess the convergence behavior.
Our experiments were performed on a computer with the following configuration: Intel Core 2 Duo 2.5 GHz CPU, 3 GB DDR2 RAM, and the Windows 7 operating system. The parameters were set as follows:
• The number of ants in each iteration is 30.

• ρ1 = 0.3, ρ2 = 0.7, α = β = 1.
• τmax = 1.0 and τmin = τmax/(n²·Vmax²), where n is the number of graphs and Vmax is the number of vertices of the graph with the most vertices.
• The local search procedure is applied in the last 30% of iterations.
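The parameter settings above determine the lower pheromone bound and the point at which local search is switched on. The following sketch shows both, together with a skeleton of the two-stage loop. It is only an outline under assumptions: the callables `build`, `improve`, `update` and `score` are hypothetical placeholders for the components described in Section 3, and applying `improve` to every ant's solution is one reading of Algorithm 1.

```python
def tau_min(tau_max, n_graphs, v_max):
    """Lower pheromone bound: tau_max / (n^2 * Vmax^2)."""
    return tau_max / (n_graphs ** 2 * v_max ** 2)


def run_two_stage(n_iterations, n_ants, build, improve, update, score):
    """Skeleton of the two-stage loop with local search in the last 30%.

    build()    -- returns one alignment built by an ant (hypothetical callable)
    improve(s) -- returns s after local search (hypothetical callable)
    update(s)  -- updates the pheromone trails from the iteration-best solution
    score(s)   -- returns the score of alignment s (higher is better)
    """
    first_stage_iters = n_iterations * 7 // 10   # first 70% of iterations
    best = None
    for it in range(n_iterations):
        solutions = [build() for _ in range(n_ants)]
        if it >= first_stage_iters:               # second stage only
            solutions = [improve(s) for s in solutions]
        iteration_best = max(solutions, key=score)
        update(iteration_best)
        if best is None or score(iteration_best) > score(best):
            best = iteration_best
    return best
```

For a data set of 16 graphs whose largest graph has 94 vertices, `tau_min(1.0, 16, 94)` evaluates 1/(256·8836) = 1/2262016, so the lower bound is several orders of magnitude below τmax.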
A. Solution quality and runtime comparisons
The empirical data consist of 74 structures generated from the Cavbase database. Each structure represents a protein cavity belonging to the protein family of thermolysin, a bacterial protease commonly used in protein analysis and annotated with the EC number 3.4.24.27 in the ENZYME database [5].
In this data set, each generated graph has 42 to 94 vertices. From the 74 structures, graphs are selected at random to generate data sets consisting of 4, 8, 16, and 32 graphs. To compare the solution quality of the algorithms, we ran each algorithm on each data set 20 times and took the average values for comparison.
The scores and runtimes of the algorithms are shown in Table 1.


Table 1. Comparison of the score and runtime with the data sets consisting of 4, 8, 16 and 32 graphs

Method / Number of graphs      4          8           16           32
Greedy      Score              -4098      -11827      -56861       -267004
            Time               22         53          313          1333
GAVEO       Score              -1099.6    -2688.1     -10268.3     -82250
            Time               756        1804        7155         16378
ACO-MGA2    Score              -971.8     -2277.8     -7857.2      -53960.1
            Time               272        1374        4151         18005

Remark: The experimental results in Table 1 show that:
• The Greedy algorithm runs much faster than ACO-MGA2 and GAVEO, but its solution quality is much lower.
• ACO-MGA2 has the best solution quality in every case, and its advantage over GAVEO becomes more prominent as the number of graphs increases. In terms of runtime, ACO-MGA2 is also faster than GAVEO on the data sets of 4, 8 and 16 graphs, but slower on the data set of 32 graphs (18005 versus 16378).

B. Comparing GAVEO and ACO-MGA2 under a predetermined amount of time
Because the greedy method requires little runtime and its solution quality is very low, in this section we only compare the solution quality of GAVEO and ACO-MGA2 under the same runtime.
We ran the GAVEO and ACO-MGA2 algorithms on a data set of 16 graphs, each containing 45 to 94 vertices, with the runtime increasing from 1000s to 6000s. The results are shown in the chart in Figure 2.

[Figure 2: chart of Score against Time (s), from 1000s to 6000s, for ACO-MGA2 and GAVEO]
Fig.2. Comparison of results of the ACO-MGA2 algorithm and the GAVEO algorithm on a data set of 16 graphs when the runtime increases from 1000s to 6000s.

Remark: The chart in Figure 2 shows that as the runtime increases from 1000s to 6000s, the solution quality of ACO-MGA2 is always better than that of GAVEO.

V. CONCLUSIONS

MGA is a new approach to the structural analysis of biological molecules, and until now three algorithms have been introduced to solve it. The Greedy algorithm is a heuristic algorithm, so it is very fast, but its solution quality is not good. The newly proposed algorithm ACO-MGA2 is an improved version of ACO-MGA. Experiments showed that it outperforms the GAVEO algorithm with respect to solution quality and, on most data sets, runtime.
As with other ACO-based algorithms, ACO-MGA2 could easily be parallelized to handle a large number of graphs.

ACKNOWLEDGMENT
This work was done during the authors' stay at the Vietnam Institute for Advanced Study in Mathematics (VIASM).

REFERENCES

[1] A. E. Aladag and C. Erten (2013), “SPINAL: scalable protein interaction network alignment,” Bioinformatics, Vol. 29, 917-924.
[2] D. Conte, P. Foggia, C. Sansone, and M. Vento (2004), “Thirty Years of Graph Matching in Pattern Recognition,” Int’l J. Pattern Recognition and Artificial Intelligence, Vol. 18, No. 3, 265-298.
[3] O. Dror, H. Benyamini, R. Nussinov, and H. Wolfson (2003), “MASS: Multiple Structural Alignment by Secondary Structures,” Bioinformatics, Vol. 19, No. 1, 95-104.
[4] D. Do Duc, H. Q. Dinh, and H. Hoang Xuan (2008), “On the Pheromone Update Rules of Ant Colony Optimization Approaches for the Job Shop Scheduling Problem,” 11th Pacific Rim International Conference on Multi-Agents, PRIMA 2008, Hanoi, Vietnam (LNCS), December 15-16, 153-160.
[5] M. Dorigo and T. Stutzle (2004), Ant Colony Optimization, The MIT Press, Cambridge, Massachusetts.
[6] T. Fober, M. Mernberger, G. Klebe, and E. Hullermeier (2009), “Evolutionary Construction of Multiple Graph Alignments for the Structural Analysis of Biomolecules,” Bioinformatics, Vol. 25, No. 16, 2110-2117.
[7] J. F. Gibrat, T. Madej, and S. H. Bryant (1996), “Surprising similarities in structure comparison,” Current Opinion in Structural Biology, Vol. 6, No. 3, 377-385.
[8] M. Hendlich, A. Bergner, J. Günther, and G. Klebe (2003), “Relibase: Design and Development of a Database for Comprehensive Analysis of Protein-Ligand Interactions,” J. Molecular Biology, Vol. 326, 607-620.
[9] H. Hoang Xuan, T. Nguyen Linh, D. Do Duc, and H. Huu Tue (2012), “Solving the Traveling Salesman Problem with Ant Colony Optimization: A Revisit and New Efficient Algorithms,” REV Journal on Electronics and Communications, Vol. 2, No. 3-4, July-December, 121-129.
[10] K. Kinoshita and H. Nakamura (2005), “Identification of the Ligand Binding Sites on the Molecular Surface of Proteins,” Protein Science, Vol. 14, No. 3, 711-718.
[11] O. Kuchaiev and N. Przulj (2011), “Integrative network alignment reveals large regions of global network similarity in yeast and human,” Bioinformatics, Vol. 27, 1390-1396.
[12] M. Mernberger, G. Klebe, and E. Hullermeier (2009), “SEGA: Semiglobal Graph Alignment for Structure-Based Protein Comparison,” IEEE/ACM Trans. on Computational Biology and Bioinformatics, Vol. 8, No. 5, 1330-1342.
[13] F. Neri, C. Cotta, and P. Moscato (2012), Handbook of Memetic Algorithms, Springer.
[14] S. Schmitt, D. Kuhn, and G. Klebe (2002), “A New Method to Detect Related Function among Proteins Independent of Sequence and Fold Homology,” J. Molecular Biology, Vol. 323, No. 2, 387-406.


[15] D. Shasha, J. Wang, and R. Giugno (2002), “Algorithmics and Applications of Tree and Graph Searching,” Proc. 21st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, ACM Press, New York, USA, 39-52.
[16] M. Shatsky, R. Nussinov, and H. Wolfson (2004), “A Method for Simultaneous Alignment of Multiple Protein Structures,” Proteins: Structure, Function and Bioinformatics, Vol. 56, No. 1, 143-156.
[17] M. Shatsky, A. Shulman-Peleg, R. Nussinov, and H. J. Wolfson (2006), “The multiple common point set problem and its application to molecule binding pattern detection,” Journal of Computational Biology, Vol. 13, No. 2, 407-428.
[18] R. Spriggs, P. Artymiuk, and P. Willett (2003), “Searching for Patterns of Amino Acids in 3D Protein Structures,” J. of Chem. Inform. and Comp. Sciences, Vol. 43, No. 2, 412-421.
[19] J. D. Thompson, D. G. Higgins, and T. J. Gibson (1994), “Clustal W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice,” Nucleic Acids Research, Vol. 22, 4673-4680.
[20] Tran Ngoc Ha, Do Duc Dong, and Hoang Xuan Huan (2013), “An Efficient Ant Colony Optimization Algorithm for Multiple Graph Alignment,” Proceedings of the International Conference on Computing, Management and Telecommunications, 386-391.
[21] N. Weskamp, E. Hullermeier, D. Kuhn, and G. Klebe (2007), “Multiple Graph Alignment for the Structural Analysis of Protein Active Sites,” IEEE/ACM Trans. Comput. Biol. Bioinform., Vol. 4, No. 2, 310-320.
[22] X. Yan, P. Yu, and J. Han (2005), “Substructure Similarity Search in Graph Databases,” Proc. of ACM SIGMOD Int. Conf. on Management of Data, New York, 766-777.
[23] X. Yan, F. Zhu, J. Han, and P. Yu (2006), “Searching Substructures with Superimposed Distance,” Proc. of International Conference on Data Engineering, 88-88.
[24] S. Zhang, M. Hu, and J. Yang (2007), “Treepi: A novel graph indexing method,” Proc. of 23rd International Conference on Data Engineering, 966-975.


