
An Incremental Learning Algorithm Based on Support Vector Domain Classifier

Yinggang Zhao, Qinming He
College of Computer Science, Zhejiang University, Hangzhou 310027, China
Email: ygzl29@163.com
Abstract

Incremental learning techniques are usually used to solve large-scale problems. We first give a modified support vector machine (SVM) classification method, the support vector domain classifier (SVDC), and then propose an incremental learning algorithm based on SVDC. The basic idea of this incremental algorithm is to obtain the initial target concepts using SVDC during the training procedure and then to update these target concepts with an updating model. Different from existing incremental learning approaches, in our algorithm the model updating procedure amounts to solving a quadratic programming (QP) problem, and the updated model still owns the property of sparse solutions. Compared with other existing incremental learning algorithms, the inverse procedure of our algorithm (i.e., decremental learning) is easy to conduct (memory space permitting) without extra computation. Experimental results show that our algorithm is effective and feasible.

Keywords: Support Vector Machines, Support Vector Domain Classifier, Incremental Learning, Classification.

1. INTRODUCTION

With large amounts of data available to the machine learning community, the need to design techniques that scale well is more critical than before. As some data may be collected over long periods, there is also a continuous need to incorporate new data into the previously learned concept. Incremental learning techniques can satisfy the need for both scalability and incremental updating.

The support vector machine (SVM) is based on statistical learning theory, which has been developed over the last three decades [1,2]. It has proven very successful in many applications [3,4,5,6]. SVM is a supervised binary-class classifier: when we train on samples using SVM, the categories of the samples need to be known. However, in many cases it is rare that we can obtain data whose categories are all known; in other words, the categories of most of the obtained data are unknown. In this situation, traditional SVM is not appropriate.

Tax et al. proposed a method for data domain description called support vector domain description (SVDD) [7], which is used to describe a data domain and delete outliers. The key idea of SVDD is to describe one class of data by finding a sphere with minimum volume that contains this class of data.

The SVDD algorithm suggests the following: when we classify a binary-class dataset, if we only know the categories of part of the samples (for example, the samples with category label y_i = 1) while the categories of the other samples are unknown, then we can design a new type of classifier based on SVDD, named the support vector domain classifier (SVDC). This new classifier only needs to describe the data with known category, thereby obtaining the description boundary of this class of data. Finally, we can classify the unknown binary-class data according to the obtained boundary.

In this paper our incremental learning algorithm is based on SVDC, and it is motivated by the way people learn. When learning a complicated concept, people usually obtain an initial concept from part of the useful information and then update the obtained concept by utilizing new information. In terms of our incremental algorithm based on SVDC, it first utilizes part of the data (memory space permitting) to obtain a concept (namely, the parameters of the obtained decision hypersurface) by the SVDC learning algorithm; then, according to the information of the decision hypersurface acquired in the last step, it updates the parameters of that decision hypersurface with a specialized updating model during the incremental learning process, namely, it updates the known concept. Our algorithm has the following characteristics:

1) The incremental updating model in this algorithm has a mathematical form similar to the standard SVDC algorithm, and any algorithm used to obtain the standard SVDC can also be used to obtain the updating model of our algorithm;

2) The inverse procedure of this algorithm, i.e., the decremental learning procedure, is easy to implement; that is to say, when we perceive that the generalization performance has dropped during the incremental process, we can easily return to the last step without extra computation.

The experimental results show that the learning performance of this algorithm approaches that of batch training and that it performs well on large-scale datasets compared to other SVDC incremental learning algorithms.

The rest of this paper is organized as follows. In Section 2 we give an introduction to SVDC, and in Section 3 we present our incremental algorithm. Experimental results concerning the proposed algorithm are offered in Section 4. Section 5 collects the main conclusions.
2. Support Vector Domain Classifier

2.1 Support Vector Domain Description [7]

For a data set containing $N$ data objects, $\{x_i, i = 1, \dots, N\}$, a description is required. We try to find a closed and compact sphere area $\Omega$ with minimum volume that contains all (or most of) the needed objects, while the outliers lie outside $\Omega$. Figure 1 shows a sketch of support vector domain description (SVDD).

[Figure 1: sketch of SVDD; the support vectors lie on the spherical classification boundary and the outliers fall outside it.]
Fig. 1. SVDD classifier

This sphere is very sensitive to the most outlying object among the target objects. When one or a few very remote objects are in the training set, a very large sphere is obtained which will not represent the data very well. Therefore, we allow for some data points outside the sphere and introduce slack variables $\xi_i$. For the sphere described by center $a$ and radius $R$, we minimize the radius

$$\min \Big[\, R^2 + C \sum_i \xi_i \,\Big] \qquad (1)$$

where $C$ is a penalty constant which gives the trade-off between simplicity (or volume of the sphere) and the number of errors (the number of target objects rejected). This has to be minimized under the constraints

$$(x_i - a)^T (x_i - a) \le R^2 + \xi_i, \quad \xi_i \ge 0, \quad \forall i \qquad (2)$$

Incorporating these constraints into (1), we construct the Lagrangian

$$L(R, a, \alpha, \beta) = R^2 + C \sum_i \xi_i - \sum_i \alpha_i \big\{ R^2 + \xi_i - (x_i \cdot x_i - 2 a \cdot x_i + a \cdot a) \big\} - \sum_i \beta_i \xi_i \qquad (3)$$

with Lagrange multipliers $\alpha_i \ge 0$, $\beta_i \ge 0$. Finding the minimal solution of formula (3) can be transformed into finding the maximal solution of its dual problem

$$L(\alpha) = \sum_i \alpha_i K(x_i, x_i) - \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j) \qquad (4)$$

with constraints $\sum_i \alpha_i = 1$ and $0 \le \alpha_i \le C$, where the inner product has been replaced with a kernel function $K(\cdot,\cdot)$, and $K(\cdot,\cdot)$ is a definite kernel satisfying the Mercer condition; a popular choice, for example, is the Gaussian kernel

$$K(x, z) = \exp\big( -\|x - z\|^2 / 2\sigma^2 \big), \quad \sigma > 0 \qquad (5)$$

To determine whether a test point $z$ is within the sphere, the distance to the center of the sphere has to be calculated. A test object $z$ is accepted when this distance is smaller than the radius, i.e., when $(z - a)^T (z - a) \le R^2$. Expressing the center of the sphere in terms of the support vectors, we accept objects when

$$\|z - a\|^2 = K(z, z) - 2 \sum_i \alpha_i K(x_i, z) + \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j) \le R^2$$
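As a concrete illustration, the following Python sketch solves the dual (4) numerically and applies the kernel-expanded acceptance test above. It is a minimal sketch only, assuming the Gaussian kernel (5) and substituting a general-purpose SLSQP solver for a dedicated QP solver; the function names and tolerances are ours, not part of the original algorithm.

```python
import numpy as np
from scipy.optimize import minimize

def gaussian_kernel(X, Z, sigma):
    # K(x, z) = exp(-||x - z||^2 / (2 sigma^2)), the kernel in (5)
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def svdd_fit(X, C, sigma):
    # Maximize the dual (4) subject to sum(alpha) = 1 and 0 <= alpha_i <= C
    N = len(X)
    K = gaussian_kernel(X, X, sigma)
    dual = lambda a: -(a @ np.diag(K) - a @ K @ a)   # negated: minimize = -maximize
    res = minimize(dual, np.full(N, 1.0 / N), method="SLSQP",
                   bounds=[(0.0, C)] * N,
                   constraints=[{"type": "eq", "fun": lambda a: a.sum() - 1.0}])
    alpha = res.x
    # R^2 is the kernel distance from the center to an unbounded support vector
    dist2 = np.diag(K) - 2.0 * K @ alpha + alpha @ K @ alpha
    on_boundary = (alpha > 1e-6) & (alpha < C - 1e-6)
    R2 = dist2[on_boundary].max() if on_boundary.any() else dist2.max()
    return alpha, R2

def svdd_accept(z, X, alpha, R2, sigma):
    # Accept z when ||z - a||^2 <= R^2, expanded in kernel terms as above
    k_z = gaussian_kernel(X, z[None, :], sigma)[:, 0]
    K = gaussian_kernel(X, X, sigma)
    return 1.0 - 2.0 * alpha @ k_z + alpha @ K @ alpha <= R2   # K(z, z) = 1 for (5)
```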
2.2 Support Vector Domain Classifier

Inspired by SVDD, in this section we extend SVDD to the SVDC setting. Consider a training set of instance-label pairs $(x_i, y_i)$, $i = 1, 2, \dots, l, l+1, \dots, N$, where $x_i \in R^n$ and $y_i \in \{1, -1\}$. Now we construct a hyper-sphere for the samples with $y_i = 1$, while the samples with $y_i = -1$ are not considered; we then get the following quadratic optimization problem

$$\min \Big[\, R^2 + C \sum_i \xi_i \,\Big] \quad \text{s.t.} \quad y_i \big( R^2 - (x_i - a)^T (x_i - a) \big) \ge -\xi_i \qquad (6)$$

where $\xi_i \ge 0$, $y_i = 1$, and $C$ is a constant. Similarly, using multipliers $\alpha_i \ge 0$, $\beta_i \ge 0$, we introduce the Lagrangian

$$L(R, a, \alpha, \beta) = R^2 + C \sum_i \xi_i - \sum_i \alpha_i \big[ y_i \big\{ R^2 - (x_i - a)^T (x_i - a) \big\} + \xi_i \big] - \sum_i \beta_i \xi_i \qquad (7)$$

In formula (7), setting the derivatives with respect to the primal variables $R, a, \xi$ equal to zero and re-substituting the obtained results into (7) yields

$$W(\alpha) = \sum_i \alpha_i y_i K(x_i, x_i) - \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \qquad (8)$$
$$\text{s.t.} \quad \sum_i y_i \alpha_i = 1, \quad 0 \le \alpha_i \le C$$

Then we can design the binary-class sphere-structured SVM classifier

$$f(x) = \mathrm{sgn}\Big( R^2 - K(x, x) + 2 \sum_i \alpha_i y_i K(x_i, x) - \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \Big) \qquad (9)$$

If $f(x) > 0$, the tested sample is contained in the sphere, and we regard the samples enclosed in the sphere as same-class objects. Otherwise it is rejected, and we regard it as an opposite-class object.
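Under the same assumptions, the decision rule (9) can be written directly from a solution $\alpha$ of (8). Again a sketch with hypothetical names, where `kernel` is any two-argument kernel function such as the Gaussian above:

```python
import numpy as np

def svdc_decision(x, X, y, alpha, R2, kernel):
    # f(x) = sgn(R^2 - K(x,x) + 2 sum_i a_i y_i K(x_i, x)
    #            - sum_ij a_i a_j y_i y_j K(x_i, x_j)), as in (9)
    ay = alpha * y
    K_xx = kernel(x, x)
    K_xi = np.array([kernel(xi, x) for xi in X])
    K_ij = np.array([[kernel(xi, xj) for xj in X] for xi in X])
    score = R2 - K_xx + 2.0 * ay @ K_xi - ay @ K_ij @ ay
    return 1 if score > 0 else -1   # > 0: inside the sphere, same class as y_i = 1
```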
3. SVDC Incremental Learning Algorithm

According to formula (6), we suppose that the initial parameter (the sphere radius) learned from the initial training set is $R_0$ and that the corresponding set of support vectors is $SV_0$. The parameter becomes $R_k$ at the $k$-th incremental learning step, the set of support vectors becomes $SV_k$, and the new dataset arriving at the $k$-th step is $D_k = \{(x_j^k, y_j^k)\}_{j=1}^{l_k}$. The center of the sphere is expanded in terms of the support vectors as

$$a = \sum_k \alpha_k y_k x_k \qquad (10)$$

where, in formula (10), $x_k$ represents a support vector and the sum runs over the number of support vectors.

Assume $R_{k-1}$ is known; we update the current model using $SV_{k-1}$ and the new dataset $\{(x_i^k, y_i^k)\}_{i=1}^{l_k}$ by solving the following quadratic programming (QP) problem:

$$\min\ g(R_k) = \| R_k - R_{k-1} \|^2$$
$$\text{s.t.} \quad y_i^k \big( R_k^2 - (x_i^k - a)^T (x_i^k - a) \big) \ge -\xi_i^k, \quad x_i^k \in D_k \qquad (11)$$

where $R_{k-1}$ is the radius obtained from the previous optimization problem (11); when $k = 1$, $R_0$ is the radius of the standard SVDC. Obviously, when $R_{k-1} = 0$ the incremental SVDC has the same form as the standard SVDC. We will find that the model updated by the incremental SVDC also owns the special property of solution sparsity possessed by the standard SVDC.

In order to solve (11), we transform it into its dual problem and introduce the Lagrangian

$$L = \| R_k - R_{k-1} \|^2 - \sum_{i=1}^{l_k} \alpha_i^k \big[ y_i^k \big( R_k^2 - (x_i^k - a)^T (x_i^k - a) \big) + \xi_i^k \big] - \sum_{i=1}^{l_k} \beta_i^k \xi_i^k \qquad (12)$$

where $\alpha_i^k, \beta_i^k \ge 0$ $(i = 1, \dots, l_k)$ are the Lagrangian multipliers. According to the optimality conditions, we can further get the following equation:

$$R_k = R_{k-1} + \sum_{x_k \in SV_k} \alpha_k y_k x_k \qquad (13)$$

Finally we obtain the following decision function:

$$f_k(x) = \mathrm{sgn}\Big\{ R_k^2 - K(x, x) + 2 \sum_{x_i \in SV_k} \alpha_i y_i K(x, x_i) - \sum_{x_i, x_j \in SV_k} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \Big\}$$
$$= \mathrm{sgn}\Big\{ R_{k-1}^2 + 2 R_{k-1} \sum_{x_i \in SV_k} \alpha_i y_i x_i + \Big( \sum_{x_i \in SV_k} \alpha_i y_i x_i \Big)^2 - K(x, x) + 2 \sum_{x_i \in SV_k} \alpha_i y_i K(x, x_i) - \sum_{x_i, x_j \in SV_k} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \Big\}$$
$$= \mathrm{sgn}\Big\{ f_{k-1}(x) + 2 R_{k-1} \sum_{x_i \in SV_k} \alpha_i y_i x_i + \Big( \sum_{x_i \in SV_k} \alpha_i y_i x_i \Big)^2 \Big\} \qquad (14)$$

From equation (14) we can see that it is easy to return to the last step of incremental learning without extra computation. From the above analysis we can also see that only a trifling modification of the standard SVDC is needed for it to solve the updated model in the incremental learning procedure. We now summarize our algorithm as follows:

Step 1 Learning the initial concept: train SVDC using the initial dataset $TS_0$; the parameter $R_0$ is then obtained;
Step 2 Updating the current concept: when new data are available, use them to solve the QP problem of formula (11) and obtain the new concept;
Step 3 Repeat Step 2 until the incremental learning is over.
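The three steps can be organized as a small driver loop. The sketch below is only an outline of the control flow, assuming hypothetical `fit_svdc`, `update_qp`, and `evaluate` routines that stand in for the standard SVDC solver, the update QP (11), and a validation measure; keeping the previous $(R, SV)$ pair is what makes the decremental return of equation (14) free of extra computation.

```python
def incremental_svdc(initial_set, chunks, fit_svdc, update_qp, evaluate):
    # Step 1: learn the initial concept (R_0, SV_0) with standard SVDC
    R, SV = fit_svdc(initial_set)
    history = [(R, SV)]
    for D_k in chunks:
        # Step 2: update the current concept by solving the QP in (11)
        R_new, SV_new = update_qp(R, SV, D_k)
        if evaluate(R_new, SV_new) < evaluate(R, SV):
            # decremental step: generalization dropped, so keep (R_{k-1}, SV_{k-1});
            # nothing is recomputed because the old model was retained
            continue
        R, SV = R_new, SV_new
        history.append((R, SV))
    # Step 3: the loop ends when no new data chunks remain
    return R, SV, history
```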
4. Experiments and Results

In order to evaluate the learning performance offered by our incremental algorithm, we conducted experiments on six different datasets taken from the UCI Machine Learning Repository: Banana, Diabetes, Flare-Solar, Heart, Breast-Cancer, and German. Note that some of them are not binary-class classification problems, but we have transformed them into binary-class problems. The datasets and experiment parameters are shown in Table 1. For notational simplicity, our algorithm is abbreviated as Our ISVM in Figure 2.

In addition to conducting experiments with our algorithm, we also implemented and tested another popular and effective incremental learning algorithm, ISVM [8][9], on the same datasets in order to compare their learning performance. In our experiments we chose the RBF kernel $K(x, y) = \exp(-\|x - y\|^2 / 2\sigma^2)$ as the kernel function, and the kernel width $\sigma$ is not fixed. The MATLAB SVM toolbox contributed by Gunn [10] was used in our experiments, and the software and hardware environment was an Intel P4 PC (1.4 GHz CPU, 256 MB memory) running the Windows XP operating system.

Table 1. Datasets and experiment parameters

Dataset        #TRS   #TES   #ATT   C           σ
Banana         400    4900   2      3.162e+02   1.000e+00
Breast-Cancer  200    77     9      1.519e+01   5.000e+01
Diabetes       468    300    8      1.000e+01   2.000e+01
Flare-Solar    666    400    9      1.023e+01   3.000e+00
German         700    300    20     3.162e+00   5.500e+01
Heart          170    100    13     3.162e+00   1.200e+02

In Table 1, #TRS represents the number of training samples, #TES the number of testing samples, and #ATT the number of attributes; C is the penalty constant and σ is the kernel width.

The literature [8] points out that an efficient incremental learning algorithm should satisfy the following three criteria (a sketch that checks them on a recorded accuracy curve follows this list):

A. Stability: when each step of incremental learning is over, the prediction accuracy on the test set should not vary too much;
B. Improvement: as the incremental learning proceeds, the algorithm's prediction accuracy should improve gradually;
C. Recoverability: the incremental learning algorithm should own the ability of performance recovery; that is to say, when the learning performance of the algorithm drops after a certain learning step, the algorithm can recover, and even surpass, its former performance in the later learning procedure.
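These criteria can be checked mechanically on a per-step accuracy curve. The helper below is our own illustrative reading of [8], not code from the paper; the `tol` threshold for "not vary too much" is an assumption.

```python
def check_criteria(acc, tol=2.0):
    # acc: test accuracies (%) after each incremental step; tol is illustrative
    steps = range(1, len(acc))
    stability = all(abs(acc[k] - acc[k - 1]) <= tol for k in steps)      # A
    improvement = acc[-1] >= acc[0]                                      # B
    # C: every drop is eventually recovered, i.e. some accuracy from step k
    # onward reaches at least the pre-drop level
    recoverability = all(any(a >= acc[k - 1] for a in acc[k:]) for k in steps)
    return stability, improvement, recoverability
```

For example, `check_criteria([85, 86, 85, 87, 88])` returns `(True, True, True)` with the default tolerance.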
Figure 2 shows the experimental results of the two different incremental learning algorithms.

[Figure 2: six panels, one per dataset (Banana, Breast-Cancer, Diabetes, Flare-Solar, German, Heart), plotting test accuracy (%) against the incremental learning step (1 to 10) for ISVM and Our ISVM.]

Fig. 2. Performance of two incremental learning algorithms
From Figure 2 we can see that after each step of incremental training the variation of the prediction accuracy on the test set is small, which satisfies the requirement of algorithm stability; we can also observe that the accuracy improves gradually and that the algorithm owns the ability of performance recovery. So the incremental algorithm proposed in this paper meets the demands of incremental learning.

The experimental results show that our algorithm has learning performance similar to the popular ISVM algorithm presented in [9]. Another discovery in our experiments is that as our incremental learning algorithm proceeds, the improvement in learning performance becomes smaller and smaller, until at last the learning performance no longer improves. This indicates that we can use this characteristic to estimate the number of samples needed to describe a problem.
5. Conclusion

In this paper we proposed an incremental learning algorithm based on the support vector domain classifier (SVDC). Its key idea is to obtain the initial concept using the standard SVDC and then to apply the updating technique presented in this paper, which in fact amounts to solving a QP problem similar to the one solved in the standard SVDC algorithm. Experiments show that our algorithm is effective and promising. Other characteristics of this algorithm include: the updating model has a mathematical form similar to the standard SVDC, and we can acquire the sparsity expression of its solutions; meanwhile, the algorithm can return to the last step without extra computation; furthermore, it can be used to estimate the number of samples needed to describe a problem.
REFERENCES

[1] C. Cortes, V. N. Vapnik: Support vector networks, Mach. Learn. 20 (1995), pp. 273-297.
[2] V. N. Vapnik: Statistical Learning Theory, Wiley, New York, 1998.
[3] T. Joachims: Text categorization with support vector machines: learning with many relevant features, Proceedings of the European Conference on Machine Learning, Springer, Berlin, 1998, pp. 137-142.
[4] S. Tong, E. Chang: Support Vector Machine Active Learning for Image Retrieval, Proceedings of the ACM International Conference on Multimedia, 2000, pp. 107-118.
[5] Yang Deng et al.: A New Method in Data Mining: Support Vector Machines, Beijing: Science Press, 2004.
[6] L. Baoqing: Distance-based selection of potential support vectors by kernel matrix, International Symposium on Neural Networks 2004, LNCS 3173, pp. 468-473, 2004.
[7] D. Tax: One-class Classification, PhD thesis, Delft University of Technology, http://www.phtn.tudelft.nl/~davidt/thesis.pdf (2001).
[8] N. A. Syed, H. Liu, K. Sung: From incremental learning to model independent instance selection: a support vector machine approach, Technical Report TRA9/99, NUS, 1999.
[9] L. Yangguang, C. Qi, T. Yongchuan et al.: Incremental updating method for support vector machine, APWeb 2004, LNCS 3007, pp. 426-435, 2004.
[10] S. R. Gunn: Support vector machines for classification and regression, Technical Report, Image Speech and Intelligent Systems Research Group, University of Southampton, 1997.