ECE 307 – Techniques for Engineering
Decisions
Using Data
George Gross
Department of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign
© 2006 – 2009 George Gross, University of Illinois at Urbana-Champaign, All Rights Reserved.
FOCUS
Use of historical data to obtain probability
distributions
The interpretation of probability information
Use of estimators
Application example
EXAMPLE
Consider the interpretation of the statement
P { sunny day in June in Champaign} = 0.53
June weather patterns in Champaign for the past
20 years are collected and every day is classified
as either sunny or not sunny
600 days of June data are available with 318 or
53% of these days classified as sunny
Given the long-term historical behavior, the
probability of 0.53 makes sense
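The relative-frequency reading of this statement can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted in the example and assumes 30 days of June per year; the variable names are illustrative.

```python
# Relative-frequency estimate of P{sunny day in June in Champaign}
years = 20
days_in_june = 30
total_days = years * days_in_june   # June days on record
sunny_days = 318                    # days classified as sunny

p_sunny = sunny_days / total_days
print(f"{sunny_days}/{total_days} = {p_sunny:.2f}")
```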
USE OF HISTOGRAMS
[Figure: histogram of the outage capacity of a generating plant (MW); vertical axis: relative frequency (days/3650); labeled bars: 0 outage (rated capacity), high derated capacity, low derated capacity, full outage capacity]
CONSTRUCTION OF THE c.d.f.
[Figure: c.d.f. of X plotted against x, rising to 1.0; the point a on the horizontal axis corresponds to the height p, where P{ X ≤ a } = p]
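One common way to build such a c.d.f. from data is the empirical c.d.f., which sets F(a) to the fraction of observations that do not exceed a. The sketch below is an illustration under that assumption, not necessarily the construction used in the figure; `empirical_cdf` and the sample values are hypothetical.

```python
from bisect import bisect_right

def empirical_cdf(samples):
    """Return F, where F(a) = (number of samples <= a) / n."""
    ordered = sorted(samples)
    n = len(ordered)

    def F(a):
        # bisect_right gives the count of sorted values that are <= a
        return bisect_right(ordered, a) / n

    return F

# Hypothetical observations, for illustration only
F = empirical_cdf([3.0, 1.0, 4.0, 1.5, 2.0])
print(F(2.0))   # estimate of P{X <= 2.0}
```

Evaluating F at each observed value gives the staircase points from which a plot like the one above can be drawn.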
STATISTICAL PARAMETER
ESTIMATORS
Estimator of the mean of the distribution
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
Estimator of the variance of the distribution
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2$$
STATISTICAL PARAMETER
ESTIMATORS
We use a set of random samples $\{x_1, x_2, \ldots, x_n\}$
of a r.v. X : these are n randomly picked values
from the sample space of X
The estimator $\bar{x}$ computed with the set of random
samples provides an estimate of
$\mu = E\{X\}$
The estimator $s^2$ computed with the set of random
samples provides an estimate of
$\sigma^2 = \mathrm{var}\{X\}$
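The two estimators translate directly into code. The following is a minimal sketch; the function names and the sample values are illustrative assumptions, and Python's standard `statistics` module is used only as a cross-check.

```python
import statistics

def sample_mean(xs):
    """Estimator of the mean: sum of the samples divided by n."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Estimator of the variance: squared deviations divided by n - 1."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Hypothetical random samples of a r.v. X, for illustration only
samples = [2.0, 4.0, 4.0, 5.0, 7.0]

x_bar = sample_mean(samples)
s2 = sample_variance(samples)
print(x_bar, s2)

# statistics.variance also divides by n - 1, so the results should agree
print(statistics.mean(samples), statistics.variance(samples))
```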
EXAMPLE: TACO SHELLS
This application example focuses on taco shells
and is concerned with the high breakage rate in
the shipment of most taco shells: the typical rate is
10–15%
A company with a new shipping container claims
to have a lower breakage rate of approximately 5%
This company's price is $25 for a box of 500 taco
shells vs. $23.75 for a box of 500 taco shells from the
current supplier
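Before any test data are collected, the quoted prices and breakage rates already allow a rough comparison. The sketch below assumes, purely for illustration, that each box loses exactly the stated fraction of its 500 shells; `cost_per_usable_shell` is a hypothetical helper.

```python
BOX_SIZE = 500  # shells per box for both suppliers

def cost_per_usable_shell(price_per_box, breakage_rate, box_size=BOX_SIZE):
    """Price of a box divided by the number of shells that survive shipping."""
    usable = box_size * (1.0 - breakage_rate)
    return price_per_box / usable

# New supplier at its claimed 5% breakage rate
new_claimed = cost_per_usable_shell(25.00, 0.05)

# Current supplier at the two ends of the typical 10-15% range
current_low = cost_per_usable_shell(23.75, 0.10)
current_high = cost_per_usable_shell(23.75, 0.15)

print(f"new supplier:     ${new_claimed:.4f} per usable shell")
print(f"current supplier: ${current_low:.4f} to ${current_high:.4f} per usable shell")
```

Under these assumptions the higher box price is offset only if the claimed breakage rate holds, which is what the test run is meant to check.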
EXAMPLE: TACO SHELLS
A test run using 12 boxes from the new company
and 18 boxes from the current company is
performed and used for comparison purposes: in
other words, we pick randomly
$\{x_1, x_2, \ldots, x_{12}\}$
from the sample space of the r.v. X describing the
new company shells and $\{y_1, y_2, \ldots, y_{18}\}$ from the
sample space of the r.v. Y describing the current
company shells
The data of the useable shells from the two
suppliers are tabulated
EXAMPLE: TACO SHELLS
useable shells per box

new supplier (12 boxes)    current supplier (18 boxes)
468  467                   444  441  450
474  469                   449  434  444
474  484                   443  427  433
479  470                   440  446  441
482  463                   439  452  436
478  468                   448  442  429
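Applying the mean and variance estimators to the tabulated counts is a short exercise. The sketch below is illustrative: it assumes the values above 460 belong to the new supplier and the rest to the current supplier, and the helper `summarize` is hypothetical.

```python
import statistics

# Useable shells per box from the test run
new_supplier = [468, 467, 474, 469, 474, 484,
                479, 470, 482, 463, 478, 468]          # 12 boxes
current_supplier = [444, 441, 450, 449, 434, 444,
                    443, 427, 433, 440, 446, 441,
                    439, 452, 436, 448, 442, 429]      # 18 boxes

def summarize(name, counts):
    """Print the sample mean, sample variance (n - 1 divisor), and range."""
    mean = statistics.mean(counts)
    var = statistics.variance(counts)
    print(f"{name}: n={len(counts)}, mean={mean}, s^2={var:.2f}, "
          f"min={min(counts)}, max={max(counts)}")

summarize("new supplier", new_supplier)
summarize("current supplier", current_supplier)
```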
EXAMPLE: TACO SHELLS
[Figure: decision tree with two branches. New supplier, $25.00/case: number of unbroken shells (x), cost per unbroken shell 25/x. Current supplier, $23.75/case: number of unbroken shells (y), cost per unbroken shell 23.75/y.]
c.d.f.s CONSTRUCTED FOR THE TWO
SUPPLIERS
[Figure: c.d.f.s of the current supplier and the new supplier; horizontal axis: unbroken shells per box (420 to 490); vertical axis: cumulative probability (0 to 1); marked values: 441 for the current supplier and 473 for the new supplier]
c.d.f.s OF THE TWO SUPPLIERS
Clearly, the new supplier has the higher expected
number of useable shells per box; the two
distributions, however, are highly similar
The mean number of useable shells for the new
supplier is 473 and so the expected cost per
useable shell is $0.0529; the minimum (maximum)
number of useable shells is 463 (484)
The mean number of useable shells for the
current supplier is 441 and so the expected cost
per useable shell is $0.0539; the minimum
(maximum) number of useable shells is 427 (452)
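The quoted costs follow from dividing each box price by the corresponding mean number of useable shells. As a worked check, using only the prices and means stated in the example:

$$\frac{25.00}{473} \approx 0.0529, \qquad \frac{23.75}{441} \approx 0.0539$$

On this basis the new supplier is cheaper by roughly $0.001 per useable shell.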
EXAMPLE: TACO SHELLS
supplier                        number of usable shells   probability   cost per usable shell ($)
new supplier, $25.00/box        462                       0.185         0.0541
                                472                       0.630         0.0530
                                485                       0.185         0.0515
current supplier, $23.75/box    427                       0.185         0.0556
                                442                       0.630         0.0537
                                452                       0.185         0.0525
COMMENTS
We use the c.d.f.s to estimate the means of the
two populations of suppliers
Typically, for the function 1/X,
$$E\left\{ \frac{1}{X} \right\} \neq \left[ E\{X\} \right]^{-1}$$
and so we cannot use the approximation
$$E\left\{ \frac{25}{X} \right\} \approx \frac{25}{E\{X\}}$$
This example demonstrates the usefulness of the
c.d.f.s in applications even when they can only be
approximated from the available data
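The difference between the expected cost and the cost at the expected count can be examined numerically with the test-run data. The sketch below is illustrative: it estimates E{price/X} by averaging price/x over the sampled boxes and compares it with price divided by the sample mean; `compare` is a hypothetical helper, and the assignment of counts to suppliers is assumed from the tabulated values.

```python
from statistics import mean

new_supplier = [468, 467, 474, 469, 474, 484,
                479, 470, 482, 463, 478, 468]
current_supplier = [444, 441, 450, 449, 434, 444,
                    443, 427, 433, 440, 446, 441,
                    439, 452, 436, 448, 442, 429]

def compare(price_per_box, usable_counts):
    """Return (estimate of E{price/X}, price / estimate of E{X})."""
    mean_of_ratio = mean([price_per_box / x for x in usable_counts])
    ratio_of_mean = price_per_box / mean(usable_counts)
    return mean_of_ratio, ratio_of_mean

for name, price, counts in [("new", 25.00, new_supplier),
                            ("current", 23.75, current_supplier)]:
    m, r = compare(price, counts)
    print(f"{name}: mean of price/x = {m:.6f}, price/mean(x) = {r:.6f}, "
          f"gap = {m - r:.2e}")
```

Because 1/x is convex for positive x, the average of price/x is never below price divided by the average of x; how much the two differ depends on the spread of the counts.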