
IMPROVED ALGORITHM FOR ADABOOST WITH SVM BASE CLASSIFIERS

Xiaodan WANG, Chongming WU, Chunying ZHENG, Wei WANG
Department of Computer Engineering, Air Force Engineering University
afeu_wg@yahoo.com.cn
Abstract

The relation between the performance of AdaBoost and the performance of its base classifiers was analyzed, and the approach of improving the classification performance of AdaBoostSVM was studied. There is an inconsistency between the accuracy and the diversity of the base classifiers, and this inconsistency affects the generalization performance of the algorithm. A new variable σ-AdaBoostSVM was proposed, in which the kernel function parameter of each base classifier is adjusted according to the distribution of the training samples. The proposed algorithm improves the classification performance by striking a balance between the accuracy and the diversity of the base classifiers. Experimental results indicate the effectiveness of the proposed algorithm.

Keywords: Support Vector Machine; AdaBoost.

1. INTRODUCTION

Boosting is a machine-learning method that works in Valiant's PAC (probably approximately correct) learning model[1]. A "weak" learning algorithm that performs just slightly better than random guessing in the PAC model can be "boosted" into an arbitrarily accurate "strong" learning algorithm. Schapire[2] came up with the first provable polynomial-time boosting algorithm. Freund[3] developed a much more efficient boosting algorithm which, although optimal in a certain sense, nevertheless suffered from certain practical drawbacks.

The AdaBoost algorithm, introduced by Freund and Schapire[4], solved many of the practical difficulties of the earlier boosting algorithms and can easily be used to solve practical problems. AdaBoost[4] creates a collection of weak learners by maintaining a set of weights over the training samples and adjusting these weights adaptively after each weak-learning cycle: the weights of the samples misclassified by the current weak learner are increased, while the weights of the correctly classified samples are decreased. The success of AdaBoost can be explained as enlarging the margin[5], which could enhance AdaBoost's generalization capability.

Support vector machine (SVM)[6] was developed from the theory of structural risk minimization. By using a kernel trick to map the training samples from an input space to a high-dimensional feature space, SVM finds an optimal separating hyperplane in the feature space and uses a regularization parameter, C, to balance its model complexity and training error.
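To make these two knobs concrete, the following is a small, self-contained illustration (not taken from this paper) of training an RBF-kernel SVM with scikit-learn; the toy data and the parameter values C and gamma (which plays the role of the Gaussian kernel width) are assumptions chosen only for demonstration.

```python
import numpy as np
from sklearn.svm import SVC

# Toy two-class problem that is not linearly separable in the input space;
# the RBF kernel implicitly maps it to a feature space where it is.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0, 1, -1)

# gamma controls the Gaussian kernel width; C balances model complexity
# against training error, as described above.
clf = SVC(kernel="rbf", C=1000.0, gamma=0.5).fit(X, y)
print("training accuracy:", clf.score(X, y))
```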
How about the generalization performance of using SVM as the base learner of AdaBoost? Does this AdaBoost have advantages over the existing ones? Also, compared with using a single SVM, what is the benefit of using this AdaBoost, which is a combination of multiple SVMs? These have been attractive research issues in recent years[7][8][9][10].

After analyzing the relation between the performance of AdaBoost and the performance of its base classifiers, the approach of improving the classification performance of AdaBoost with SVM base classifiers was studied in this paper. A new variable σ-AdaBoostSVM was proposed, in which the kernel function parameter of each base classifier is adjusted according to the distribution of the training samples. The proposed algorithm improves the classification performance by striking a balance between the accuracy and the diversity of the base classifiers. Experimental results for benchmark data sets indicate the effectiveness of the proposed algorithm.

2. ADABOOST

Given a set of training samples {(x1, y1), ..., (xn, yn)}, each training sample xi belongs to some domain or instance space X, and each class label yi belongs to some label set Y; here xi ∈ X and yi ∈ Y = {-1, +1}. AdaBoost calls a given weak or base learning algorithm repeatedly in a series of rounds t = 1, ..., T. One of the main ideas of the algorithm is to maintain a distribution or set of weights over the training set.
The weight of this distribution on training example xi on round t is denoted wt(i), i.e., wt(i) is the weight of sample xi at iteration round t. Initially, all weights are set equally, but on each round the weights of incorrectly classified examples are increased so that the base learner is forced to focus on the hard examples in the training set.

The base learner's job is to find a base classifier Ct; ht is the decision function of base classifier Ct, and ht(xi) gives the class label (+1 or -1) of the training sample xi. In the simplest case, the range of each ht is binary, i.e., restricted to {-1, +1}; the base learner's job then is to minimize the error:

\varepsilon_t = \Pr_{i \sim w_t}\left[h_t(x_i) \neq y_i\right] = \sum_{i:\, h_t(x_i) \neq y_i} w_t(i)    (1)

Notice that the error is measured with respect to the distribution wt on which the base learner was trained. In practice, the base learner may be an algorithm that can use the weights wt on the training examples. Alternatively, when this is not possible, a subset of the training examples can be sampled according to wt, and these (unweighted) resampled examples can be used to train the base learner. Both resampling and reweighting can be used to train AdaBoost.

Once the base classifier ht has been obtained, AdaBoost chooses a parameter αt that intuitively measures the importance assigned to ht. For binary ht, as in the original description of AdaBoost given by Freund and Schapire[4], one typically sets

\alpha_t = \frac{1}{2}\ln\!\left(\frac{1-\varepsilon_t}{\varepsilon_t}\right)    (2)

Note that αt > 0 if εt < 1/2, and that αt gets larger as εt gets smaller.

The distribution wt is then updated using the rule shown in the algorithm. The effect of this rule is to increase the weight of examples misclassified by ht and to decrease the weight of correctly classified examples. Thus, the weight tends to concentrate on "hard" examples.

The final or combined classifier H is a weighted majority vote of the T base classifiers, where αt is the weight assigned to ht.

The algorithm for AdaBoost is given below:

1. Input: a set of training samples with labels D = {(x1, y1), ..., (xn, yn)}, xi ∈ X, yi ∈ Y = {-1, +1}; the base learner algorithm; the number of cycles T.

2. Initialize: the weight of samples: w1(i) = 1/n, for all i = 1, ..., n.

3. Do for t = 1, ..., T:

(1) Train the base classifier Ct on the weighted training sample set. Alternatively, a subset of the training examples can be sampled according to wt, and these resampled examples are used to train the base learner Ct. The decision function of Ct is ht.

(2) Calculate the training error εt of Ct:

\varepsilon_t = \sum_{i:\, y_i \neq h_t(x_i)} w_t(i)

(3) Set the weight of base classifier Ct:

\alpha_t = \frac{1}{2}\ln\!\left(\frac{1-\varepsilon_t}{\varepsilon_t}\right)

(4) Update the training samples' weights:

w_{t+1}(i) = \frac{w_t(i)\exp\{-\alpha_t y_i h_t(x_i)\}}{Z_t}
           = \frac{w_t(i)}{Z_t} \times \begin{cases} e^{-\alpha_t} & \text{if } y_i = h_t(x_i) \\ e^{\alpha_t} & \text{if } y_i \neq h_t(x_i) \end{cases}

where Zt is a normalization factor chosen so that \sum_i w_{t+1}(i) = 1.

4. Output: the final classifier:

H(x) = \mathrm{sign}\!\left[\sum_{t=1}^{T} \alpha_t h_t(x)\right]

The most basic theoretical property of AdaBoost concerns its ability to reduce the training error, i.e., the fraction of mistakes on the training set. Let the error εt of ht be εt = 1/2 - γt; γt measures how much better than random guessing (which has an error rate of 1/2) ht's classifications are. Freund and Schapire[4] prove that the training error (the fraction of mistakes on the training set) of the final hypothesis H is at most

\prod_{t=1}^{T}\left[2\sqrt{\varepsilon_t(1-\varepsilon_t)}\right] = \prod_{t=1}^{T}\sqrt{1-4\gamma_t^2} \le \exp\!\left(-2\sum_{t=1}^{T}\gamma_t^2\right)    (3)

Thus, if each base hypothesis is slightly better than random, so that γt ≥ γ for some γ > 0, then the training error drops exponentially fast. Previous boosting algorithms required that such a lower bound γ be known a priori before boosting begins. In practice, knowledge of such a bound is very difficult to obtain. AdaBoost, on the other hand, is adaptive in that it adapts to the error rates of the individual base hypotheses. This is the basis of its name: "Ada" is short for "adaptive".
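As a concrete summary of the procedure above, here is a minimal illustrative sketch in Python/NumPy of reweighting-based AdaBoost. It is not the authors' code; `base_fit` is an assumed placeholder for any weak learner that accepts per-sample weights and returns a decision function with values in {-1, +1}.

```python
import numpy as np

def adaboost(X, y, base_fit, T=10):
    """Minimal AdaBoost sketch for labels y in {-1, +1}.

    base_fit(X, y, w) is assumed to train a weak learner on the weighted
    sample set and return its decision function h, with h(X) in {-1, +1}.
    """
    n = len(y)
    w = np.full(n, 1.0 / n)                      # step 2: uniform initial weights
    hs, alphas = [], []
    for t in range(T):                           # step 3: boosting rounds
        h = base_fit(X, y, w)                    # (1) train base classifier C_t
        pred = h(X)
        eps = w[pred != y].sum()                 # (2) weighted error, eq. (1)
        if eps <= 0.0 or eps >= 0.5:             # stop if perfect or no better than random
            break
        alpha = 0.5 * np.log((1.0 - eps) / eps)  # (3) classifier weight, eq. (2)
        w *= np.exp(-alpha * y * pred)           # (4) reweight: up-weight mistakes
        w /= w.sum()                             # normalize (Z_t)
        hs.append(h)
        alphas.append(alpha)

    def H(X_new):                                # step 4: weighted majority vote
        scores = sum(a * h(X_new) for a, h in zip(alphas, hs))
        return np.sign(scores)

    return H
```

In this form the bound of Eq. (3) is easy to read off: as long as every round's weighted error stays at least γ below 1/2, the training error of H decays like exp(-2Tγ²).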

3. AN IMPROVED ALGORITHM FOR ADABOOST WITH SVM BASE CLASSIFIERS

Diversity is known to be an important factor affecting the generalization performance of ensemble methods[11][12]; it means that the errors made by different base classifiers are uncorrelated. If each base classifier is moderately accurate and the base classifiers disagree with each other, the uncorrelated errors of these base classifiers will be removed by the voting process, so good ensemble results can be achieved[13]. This also applies to AdaBoost.

Studies using SVM as the base learner of AdaBoost have been reported[7][8][9][10]. These studies showed the good generalization performance of AdaBoost.

For AdaBoost, it is known that there exists a dilemma between the base learners' accuracy and their diversity[14]: the more accurate two base learners become, the less they can disagree with each other. How, then, should SVM base learners be selected for AdaBoost? Should accurate but not diverse base learners be selected, or diverse but not too accurate ones? If an accuracy-diversity balance can be kept among the different base classifiers, a superior AdaBoost result will be obtained, but there is no generally effective way to reach such a result. We therefore analyze the case of using RBFSVM as the base classifier of AdaBoost.

The problem of model selection is very important for SVM; the classification performance of SVM is affected by its parameters. For RBFSVM, these are the Gaussian width σ and the regularization parameter C. A variation of either of them leads to a change in classification performance. However, as reported in [7], although RBFSVM cannot learn well when a very low value of C is used, its performance largely depends on the σ value if a roughly suitable C is given. How should the σ value be set for the base learners when using RBFSVM as the base learner for AdaBoost?

Problems are encountered when applying a single, fixed σ to all RBFSVM base learners. In detail[10], an over-large σ often results in a too weak RBFSVM: its classification accuracy is often less than 50% and cannot meet the requirement on a base learner in AdaBoost. On the other hand, a smaller σ often makes the RBFSVM stronger, and boosting such learners may become inefficient because their errors are highly correlated. Furthermore, a too small σ can even make RBFSVM overfit the training samples, so such learners also cannot be used as base learners. Hence, finding a suitable σ for AdaBoost with SVM base learners becomes a problem[10].

In order to avoid the problem resulting from using a single, fixed σ for all RBFSVM base classifiers, and to get good classification performance, it is necessary to find a suitable σ for each RBFSVM base classifier. Because the SVM can obtain comparably good classification performance if a roughly suitable C is given and the variance of the training samples is used as the Gaussian width σ of the RBFSVM, we use the variance of the training samples of each base classifier as its Gaussian width σ in this paper. This generates a set of moderately accurate RBFSVM classifiers for AdaBoost, and an improved algorithm is obtained; we call it the variable σ-AdaBoostRBFSVM.

In the proposed AdaBoostSVM, the obtained SVM base learners are mostly moderately accurate, which gives a chance to obtain more uncorrelated base learners. Through adjusting the σ value according to the variance of the training samples, a set of SVM base learners with different learning abilities is obtained. The proposed variable σ-AdaBoostRBFSVM is expected to achieve higher generalization performance than an AdaBoostSVM that uses a single, fixed σ for all RBFSVM base classifiers. In the proposed algorithm, without loss of generality, the re-sampling technique is used.

The algorithm for variable σ-AdaBoostRBFSVM:

1. Input: a set of training samples with labels D = {(x1, y1), ..., (xn, yn)}, xi ∈ X, yi ∈ Y = {-1, +1}; the base learner algorithm; the number of cycles T.

2. Initialize: the weight of samples: w1(i) = 1/n, for all i = 1, ..., n.

3. Do for t = 1, ..., T:

(1) Sample a subset of the training examples according to wt; these resampled examples constitute the new training data set dt, which will be used to train the base classifier Ct.

(2) Calculate the variance-based Gaussian width σ of dt:

\sigma = \sqrt{\mathrm{mean}(\mathrm{var}(d_t))}

(3) Using dt as the training sample set, train the RBFSVM base classifier Ct with Gaussian width σ; ht is the decision function of Ct.

(4) Calculate the training error εt of Ct:

\varepsilon_t = \sum_{i:\, y_i \neq h_t(x_i)} w_t(i)
(5) Set the weight of base classifier Ct:

\alpha_t = \frac{1}{2}\ln\!\left(\frac{1-\varepsilon_t}{\varepsilon_t}\right)

(6) Update the training samples' weights:

w_{t+1}(i) = \frac{w_t(i)\exp\{-\alpha_t y_i h_t(x_i)\}}{Z_t}

where Zt is a normalization factor chosen so that \sum_i w_{t+1}(i) = 1.

4. Output: the final classifier:

H(x) = \mathrm{sign}\!\left[\sum_{t=1}^{T} \alpha_t h_t(x)\right]
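For illustration, the listing above can be sketched with scikit-learn's SVC as the RBF base learner. This is not the authors' implementation (which used Steve Gunn's SVM toolbox); the mapping of the Gaussian width σ to SVC's gamma = 1/(2σ²) and the default parameter values are assumptions of the sketch.

```python
import numpy as np
from sklearn.svm import SVC

def variable_sigma_adaboost_rbfsvm(X, y, T=10, C=1000.0, seed=0):
    """Illustrative sketch of variable sigma-AdaBoostRBFSVM (resampling form).

    X: (n, d) array of samples; y: labels in {-1, +1}.
    Each round resamples the data by the current weights, sets the Gaussian
    width sigma from the variance of the resampled set, and trains an RBF-SVM.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    w = np.full(n, 1.0 / n)                                # 2. uniform initial weights
    classifiers, alphas = [], []
    for t in range(T):                                     # 3. boosting rounds
        idx = rng.choice(n, size=n, replace=True, p=w)     # (1) resample d_t by w_t
        Xt, yt = X[idx], y[idx]
        sigma = np.sqrt(np.mean(np.var(Xt, axis=0)))       # (2) sigma = sqrt(mean(var(d_t)))
        gamma = 1.0 / (2.0 * sigma ** 2)                   # assumed width-to-gamma mapping
        clf = SVC(kernel="rbf", C=C, gamma=gamma).fit(Xt, yt)  # (3) train RBFSVM C_t
        pred = clf.predict(X)
        eps = w[pred != y].sum()                           # (4) weighted error of C_t
        if eps <= 0.0 or eps >= 0.5:                       # guard: stop on degenerate rounds
            break
        alpha = 0.5 * np.log((1.0 - eps) / eps)            # (5) classifier weight
        w *= np.exp(-alpha * y * pred)                     # (6) reweight and normalize
        w /= w.sum()
        classifiers.append(clf)
        alphas.append(alpha)

    def H(X_new):                                          # 4. final weighted vote
        scores = sum(a * c.predict(X_new) for a, c in zip(alphas, classifiers))
        return np.sign(scores)

    return H
```

Under the experimental settings reported below (T = 10, C = 1000), the fixed-width AdaBoostSVM baseline would correspond to replacing the per-round sigma with a constant value (e.g. σ = 12).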
4. EXPERIMENTS AND RESULTS

To evaluate the performance of the variable σ-AdaBoostRBFSVM and to make an experimental comparison between our improved algorithm and the AdaBoostSVM that uses a single, fixed σ for all RBFSVM base classifiers, experiments on the Westontoynonliner data set and the Wine data set[8] were conducted, and the results of the classification experiments are given. The SVM we used is from Steve Gunn's SVM Toolbox.

The Westontoynonliner data set consists of 1000 samples of 2 classes, each sample having 52 attributes. The Wine data set consists of 178 samples of 3 classes, each sample having 13 attributes; class 1 is used as the positive class and the other two classes belong to the negative class in the classification experiments.

The SVMs for the variable σ-AdaBoostRBFSVM, AdaBoostSVM, and the single SVM are trained under the same parameter when comparing the performance of the algorithms, C = 1000. For the single SVM and AdaBoostSVM, the Gaussian width σ of RBFSVM is set to 12. Let T be the number of base classifiers; T = 10 in the experiments.

For the Westontoynonliner data set, the training and testing samples are chosen randomly from the given data sets; 50, 150, 200, 300, and 500 are the numbers of training samples used in the experiments, and 128 is the number of testing samples used in the experiments. For variable σ-AdaBoostRBFSVM and AdaBoostSVM, 1/2-1/10 of the training samples are used to train the base classifiers, and the average correct classification rates over 3 randomly chosen testing data sets are calculated.

Fig. 1 gives the results of the performance comparison for the Westontoynonliner data set; axis X indicates the number of training samples, and axis Y gives the correct classification rates. In Fig. 1, Ada-SVM stands for AdaBoostSVM, and improved Ada-SVM stands for variable σ-AdaBoostRBFSVM.

Fig. 1. Performance comparison for the Westontoynonliner data set.

For the Wine data set, the training and testing samples were also chosen randomly from the given data sets; 50, 80, 100, 130, and 150 are the numbers of training samples used in the experiments, and 79 is the number of testing samples used in the experiments. For the single SVM and AdaBoostSVM, the Gaussian width σ of RBFSVM is set to 2, 6, and 12, and the average correct classification rates for randomly chosen testing data sets are calculated. For variable σ-AdaBoostRBFSVM and AdaBoostSVM, 1/2-1/8 of the training samples are used to train the base classifiers, and the average correct classification rates over 3 randomly chosen testing data sets are calculated.

Fig. 2 gives the results of the performance comparison for the Wine data set; axis X indicates the number of training samples, and axis Y gives the correct classification rates. In Fig. 2, Ada-SVM stands for AdaBoostSVM, and improved Ada-SVM stands for variable σ-AdaBoostRBFSVM.

Fig. 2. Performance comparison for the Wine data set.

From Fig. 1 and Fig. 2 we can see that AdaBoostSVM and a single SVM have almost the same classification performance, but our improved AdaBoostRBFSVM improves the average correct classification rates noticeably. For the Wine data set, the distribution of the training samples is unbalanced, because there are 59 training samples in the positive class and 119 training samples in the negative class. From Fig. 2 we can see that the variable σ-AdaBoostRBFSVM is more efficient for the unbalanced data set.
Compared with using a single SVM, the benefit of using the improved AdaBoostRBFSVM is its advantage in model selection; and compared with an AdaBoost that uses a single, fixed σ for all RBFSVM base classifiers, it has better generalization performance.

5. CONCLUSIONS

AdaBoost is a general method for improving the accuracy of any given learning algorithm. After analyzing the relation between the performance of AdaBoost and the performance of its base classifiers, the approach of improving the classification performance of AdaBoostSVM was studied in this paper. There is an inconsistency between the accuracy and the diversity of the base classifiers, and this inconsistency affects the generalization performance of the algorithm. How to deal with the dilemma between the base classifiers' accuracy and diversity is very important for improving the performance of AdaBoost. A new variable σ-AdaBoostSVM was proposed, in which the kernel function parameter of each base classifier is adjusted according to the distribution of the training samples. The proposed algorithm improves the classification performance by striking a balance between the accuracy and the diversity of the base classifiers. Experimental results for the benchmark data sets indicate the effectiveness of the proposed algorithm. The experimental results also indicate that the proposed algorithm is more efficient for unbalanced data sets.

Acknowledgements

This work is supported by the Natural Science Basic Research Plan in Shaanxi Province of China under Grant 2004F36, and partially supported by NSFC under Grant 50505051.

References

[1] L. G. Valiant, "A theory of the learnable", Communications of the ACM, vol. 27, no. 11, pp. 1134-1142, November 1984.
[2] R. E. Schapire, "The strength of weak learnability", Machine Learning, vol. 5, no. 2, pp. 197-227, 1990.
[3] Yoav Freund, "Boosting a weak learning algorithm by majority", Information and Computation, vol. 121, no. 2, pp. 256-285, 1995.
[4] Y. Freund, R. E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting", Journal of Computer and System Sciences, vol. 55, no. 1, pp. 119-139, August 1997.
[5] R. E. Schapire, Y. Singer, P. Bartlett, and W. Lee, "Boosting the margin: A new explanation for the effectiveness of voting methods", The Annals of Statistics, vol. 26, no. 5, pp. 1651-1686, 1998.
[6] Vladimir Vapnik, Statistical Learning Theory, John Wiley and Sons Inc., New York, 1998.
[7] G. Valentini, T. G. Dietterich, "Bias-variance analysis of support vector machines for the development of SVM-based ensemble methods", Journal of Machine Learning Research, vol. 5, pp. 725-775, 2004.
[8] Dmitry Pavlov, Jianchang Mao, "Scaling-up Support Vector Machines Using Boosting Algorithm", in Proceedings of ICPR 2000.
[9] Hyun-Chul Kim, Shaoning Pang, Hong-Mo Je, Daijin Kim, and Sung Yang Bang, "Constructing support vector machine ensemble", Pattern Recognition, vol. 36, no. 12, pp. 2757-2767, Dec 2003.
[10] Xuchun Li, Lei Wang, Eric Sung, "A Study of AdaBoost with SVM Based Weak Learners", in Proceedings of IJCNN 2005.
[11] P. Melville, R. J. Mooney, "Creating diversity in ensembles using artificial data", Information Fusion, vol. 6, no. 1, pp. 99-111, Mar 2005.
[12] I. K. Ludmila, J. W. Christopher, "Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy", Machine Learning, vol. 51, no. 2, pp. 181-207, May 2003.
[13] H. W. Shin and S. Y. Sohn, "Selected tree classifier combination based on both accuracy and error diversity", Pattern Recognition, vol. 38, pp. 191-197, 2005.
[14] Thomas G. Dietterich, "An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization", Machine Learning, vol. 40, no. 2, pp. 139-157, Aug 2000.