Chapter 7
Discrete Probability with R
Discrete Structures for Computer Science (CO1007), December 7th, 2015

Nguyen An Khuong, Huynh Tuong Nguyen
Faculty of Computer Science and Engineering
University of Technology, VNU-HCM


Contents
1 Randomness
2 Random sampling with R
3 Probability
4 Probability Rules
5 Probability calculations and combinatorics with R
6 Discrete Random variables
7 Some Discrete Probability Models
    Geometric Model
    Binomial Model
8 The built-in distributions in R
    Densities
    Cumulative distribution functions
    Quantiles
    Random numbers
9 References and Further Reading


Motivations

• Gambling
• Real life problems
• Computer Science: cryptology, coding theory, algorithmic complexity, ...


Randomness

Which of these are random phenomena?
• The number you get when rolling a fair die
• The winning sequence for the lottery special prize (random by law!)
• Your blood type (No!)
• Meeting a red light on the way to school
  • The traffic light itself is not random: it runs on a timer.
  • The pattern of your riding is random.

So what is special about randomness?
Individual outcomes cannot be predicted, but in the long run random phenomena are predictable: each event settles to a stable relative frequency (the fraction of times the event occurs when the experiment is repeated over and over).


Randomness in Statistics

• Randomness and probability are central to statistics.
• Empirical fact: most experiments and investigations are not perfectly reproducible.
• The degree of irreproducibility may vary:
  • Some experiments in physics may yield data that are accurate to many decimal places,
  • whereas data on biological systems are typically much less reliable.
• Viewing data as something coming from a statistical distribution is vital to understanding statistical methods.
• We outline the basic ideas of probability and the functions that R has for random sampling and handling of theoretical distributions.


Random Numbers with R

• Much of the earliest work in probability theory was about games and gambling issues, based on symmetry considerations.
• The basic notion then is that of a random sample: dealing from a well-shuffled pack of cards or picking numbered balls from a well-stirred urn.
• In R, we can simulate these situations with the sample function.
• If we want to pick five numbers at random from the set 1:40, then we can write
> sample(1:40, 5)
[1] 4 30 28 40 13



Sample function

• The first argument (x) is a vector of values to be sampled.
• The second (size) is the sample size.
• Actually, sample(40, 5) would suffice, since a single number is interpreted to represent the length of a sequence of integers.
• Notice that the default behavior of sample is sampling without replacement.
• That is, the sample will not contain the same number twice, and size obviously cannot be bigger than the length of the vector to be sampled.
• If we want sampling with replacement, then we need to add the argument replace = TRUE.
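
The behavior of sample described above can be checked with a short sketch. The seed value and the variable names (draw, too_many, with_repl) are assumptions chosen for illustration only.

```r
set.seed(42)                      # hypothetical seed, only for reproducibility

# A single number n is treated as the sequence 1:n, so this draws from 1:40.
draw <- sample(40, 5)
length(unique(draw)) == 5         # no value repeats without replacement

# Asking for more values than the vector holds fails without replacement ...
too_many <- tryCatch(sample(1:5, 10), error = function(e) "error")

# ... but works once replace = TRUE is given.
with_repl <- sample(1:5, 10, replace = TRUE)
```

Wrapping the oversized request in tryCatch lets the script continue past the error that sample raises in that case.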


Sampling with replacement

• Sampling with replacement is suitable for modelling coin tosses or throws of a die.
• So, for instance, to simulate 10 coin tosses we could write
> sample(c("H","T"), 10, replace=TRUE)
[1] "T" "T" "T" "T" "T" "H" "H" "T" "H" "T"
• In fair coin-tossing, the probability of heads should equal the probability of tails, but the idea of a random event is not restricted to symmetric cases.


Data with nonequal probabilities

• You can simulate data with nonequal probabilities for the outcomes (say, a 90% chance of success) by using the prob argument to sample, as in
> sample(c("succ", "fail"), 10, replace=TRUE, prob=c(0.9, 0.1))
[1] "succ" "succ" "succ" "succ" "succ" "fail" "succ" "succ" "succ" "fail"
• This may not be the best way to generate such a sample, though. See the later discussion of the binomial distribution.
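
As a rough illustration of the prob argument, the sketch below draws a larger sample and compares the observed share of successes with the nominal 90%. The sample size, the seed, and the use of rbinom as the binomial alternative hinted at above are assumptions of this sketch, not something the slides prescribe.

```r
set.seed(1)                       # hypothetical seed, only for reproducibility
n <- 10000                        # assumed sample size for the illustration

# Weighted sampling: "succ" is drawn with probability 0.9, "fail" with 0.1.
draws <- sample(c("succ", "fail"), n, replace = TRUE, prob = c(0.9, 0.1))
prop_succ <- mean(draws == "succ")          # should be close to 0.9

# One possible binomial alternative: draw the number of successes directly.
n_succ <- rbinom(1, size = n, prob = 0.9)
n_succ / n                                  # also close to 0.9
```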


Terminology

• Experiment/trial (thí nghiệm (ngẫu nhiên)/phép thử): a procedure that randomly yields one of a given set of possible outcomes.
  • Tossing a coin to see the face
  • Rolling a die
  • ...
• Sample space (không gian mẫu, Ω): the set of all possible outcomes
  • {Head, Tail}
  • {1, 2, 3, 4, 5, 6}
• Event (sự kiện): a subset of the sample space.
  • You see Head after an experiment. {Head} is an event.
  • {1, 3, 5}
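
These terms map directly onto R vectors. The following is a minimal sketch for the die example; the names omega, odd, and outcome are illustrative assumptions.

```r
omega <- 1:6                      # sample space for rolling one die
odd <- c(1, 3, 5)                 # the event "the result is odd"

is_event <- all(odd %in% omega)   # an event must be a subset of the sample space

set.seed(3)                       # hypothetical seed, only for reproducibility
outcome <- sample(omega, 1)       # one trial of the experiment
occurred <- outcome %in% odd      # did the event occur in this trial?
```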


Example

Experiment: Rolling two dice. What is the sample space?
Answer: It depends on what we're going to ask!
• The total of the two dice?
  {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
• The number on each die?
  {(1,1), (1,2), (1,3), ..., (6,6)}

Which is better?
The latter, because its outcomes are equally likely.
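
To see why the pair representation is preferable, one can enumerate it in R and count how many pairs lead to each total. This is only a sketch; the names dice and sum_counts are assumptions.

```r
# All 36 ordered pairs (die1, die2); each pair is equally likely.
dice <- expand.grid(die1 = 1:6, die2 = 1:6)
nrow(dice)

# Number of pairs giving each total: the totals 2..12 are not equally likely.
sum_counts <- table(dice$die1 + dice$die2)
sum_counts / nrow(dice)
```

The total 7 arises from six pairs while the total 2 arises from only one, so the sample space of totals does not consist of equally likely outcomes.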


The Law of Large Numbers (LLN)

Definition
The Law of Large Numbers (Luật số lớn) states that the long-run relative frequency of repeated independent events gets closer and closer to the true relative frequency (the probability) as the number of trials increases.

Example
Do you believe that the true relative frequency of Head when you toss a coin is 50%?
Let's try!
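
One way to "try" is to simulate the tosses with sample and track the running relative frequency of heads. The number of tosses and the seed below are assumptions chosen for the sketch.

```r
set.seed(2015)                    # hypothetical seed, only for reproducibility
n <- 100000                       # assumed number of simulated tosses

tosses <- sample(c("H", "T"), n, replace = TRUE)

# Relative frequency of heads after 1, 2, ..., n tosses.
running_freq <- cumsum(tosses == "H") / seq_len(n)
running_freq[c(10, 100, 1000, 10000, 100000)]

# Optional picture of the convergence towards 0.5:
# plot(running_freq, type = "l", log = "x"); abline(h = 0.5, lty = 2)
```

Early values can be far from 0.5, while the values after many tosses typically stay close to it.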


Be Careful!

Don't misunderstand the Law of Large Numbers (LLN). Misusing it can lead to lost money and poor business decisions.

Example
I have 8 children, and all of them are girls. Thanks to the LLN (!?), there is a high probability that the next one will be a boy. (Overpopulation!!!)

Example
I'm playing Bầu cua tôm cá. The fish has not appeared in the last 5 games, so it is more likely to appear in the next game. Thus, I bet all my money on the fish. (Sorry, you lose!)
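
The mistake in both examples can be checked by simulation. The sketch below uses a fair coin instead of the dice game (an assumption made for simplicity) and estimates how often heads follows a run of five tails.

```r
set.seed(7)                       # hypothetical seed, only for reproducibility
n <- 200000                       # assumed number of simulated tosses

toss <- sample(c("H", "T"), n, replace = TRUE)
is_tail <- toss == "T"

# For each position i >= 6, check whether tosses i-5, ..., i-1 were all tails.
idx <- 6:n
after_five_tails <- is_tail[idx - 1] & is_tail[idx - 2] & is_tail[idx - 3] &
  is_tail[idx - 4] & is_tail[idx - 5]

# Relative frequency of heads immediately after such a streak.
freq_head_after_streak <- mean(toss[idx][after_five_tails] == "H")
freq_head_after_streak
```

The estimate stays near 0.5: a streak does not make the opposite outcome more likely on the next independent trial.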


Probability

Definition
The probability (xác suất) of an event E of a finite nonempty sample space Ω of equally likely outcomes is:
\[ p(E) = \frac{|E|}{|\Omega|}. \]

• Note that E ⊆ Ω, so 0 ≤ |E| ≤ |Ω|
• 0 ≤ p(E) ≤ 1
  • 0 indicates impossibility
  • 1 indicates certainty

People often say: “It has a 20% probability”


Examples

Example (1)
What is the probability of getting a Head when tossing a coin?
Answer:
• There are |Ω| = 2 possible outcomes
• Getting a Head is |E| = 1 outcome, so p(E) = 1/2 = 0.5 = 50%

Example (2)
What is the probability of getting a 7 by rolling two dice?
Answer:
• Product rule: there are a total of 6 × 6 = 36 equally likely possible outcomes
• There are six successful outcomes: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1)
• Thus, |E| = 6, |Ω| = 36, p(E) = 6/36 = 1/6


Examples

Example (3)
We toss a coin 6 times. What is the probability of H on the 6th toss if all of the previous 5 are T?
Answer:
Don't be silly! Still 1/2, because the tosses are independent.

Example (4)
Which is more likely:
• Rolling an 8 when 2 dice are rolled?
• Rolling an 8 when 3 dice are rolled?
Answer:
Two dice: 5/36 ≈ 0.139
Three dice: 21/216 ≈ 0.097
So an 8 is more likely with two dice.
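
The dice answers can be verified by brute-force enumeration, applying p(E) = |E|/|Ω| to all equally likely outcomes. The helper name p_total is a hypothetical choice for this sketch.

```r
# Probability that n_dice fair dice add up to `total`, by listing every outcome.
p_total <- function(total, n_dice) {
  rolls <- expand.grid(rep(list(1:6), n_dice))   # all 6^n_dice outcomes
  mean(rowSums(rolls) == total)                  # |E| / |Omega|
}

p_total(7, 2)    # six of 36 outcomes
p_total(8, 2)    # five of 36 outcomes
p_total(8, 3)    # 21 of 216 outcomes
```

Enumeration is only practical for a handful of dice, since the number of outcomes grows as 6^n_dice.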


Formal Probability

Rule 1
A probability is a number between 0 and 1.
0 ≤ p(E) ≤ 1

Rule 2: Something has to happen rule
The probability of the set of all possible outcomes of a trial must be 1.
p(Ω) = 1.

Rule 3: Complement Rule
The probability of an event occurring is 1 minus the probability that it doesn't occur.
\[ p(E) = 1 - p(\overline{E}), \]
where $\overline{E} = \Omega \setminus E$ is the complement of E.


Example (Birthday Problem)

Consider a group of n < 365 students. We'll ignore leap years and assume that all birthdays are equally likely.
i) If we pick a specific day (say December 7th), then what is the chance that at least one student was born on that day?
ii) What is the probability that at least one student has the same birthday as some other student?

Answer i).
• The sample space is the set of all $365^n$ possible choices of birthdays for n individuals.
• $p_1(n) = P(\text{at least one student was born on December 7th}) = 1 - P(\text{no students were born on December 7th}) = 1 - \frac{364^n}{365^n}$.
• We have $p_1(30) \approx 7.9\%$ and $p_1(91) \approx 22.1\%$.
• In order for the probability that at least one other person shares your birthday to exceed 50%, we need n large enough that
  $p_1(n) = 1 - \frac{364^n}{365^n} > 0.5$, i.e. $n > \frac{\ln 2}{\ln(365/364)} \approx 252.7$, so $n \ge 253$.
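
The formula for answer i) is easy to evaluate in R. This is a minimal sketch; the function name p1 follows the notation above, and the search range 1:1000 is an arbitrary assumption.

```r
# Probability that at least one of n students was born on a fixed day.
p1 <- function(n) 1 - (364 / 365)^n

round(p1(c(30, 91)), 3)

# Smallest group size for which the probability exceeds 50%.
n_min <- min(which(p1(1:1000) > 0.5))
n_min
```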


Birthday Problem (cont'd)

Answer ii).
• $p_2(n) = P(\text{at least one shared birthday}) = 1 - P(\text{no shared birthdays}) = 1 - \frac{365 \times 364 \times \cdots \times (365-n+1)}{365^n}$.
• Using the first-order approximation $e^x \approx 1 + x$ for $|x| \ll 1$, we have
\[
\begin{aligned}
\frac{365 \times 364 \times \cdots \times (365-n+1)}{365^n}
  &= \left(1 - \tfrac{1}{365}\right) \times \left(1 - \tfrac{2}{365}\right) \times \cdots \times \left(1 - \tfrac{n-1}{365}\right) \\
  &\approx e^{-\frac{1}{365}} \times e^{-\frac{2}{365}} \times \cdots \times e^{-\frac{n-1}{365}} \\
  &= e^{-\frac{1+2+\cdots+(n-1)}{365}} \\
  &= e^{-\frac{n(n-1)}{2 \times 365}}.
\end{aligned}
\]
• So $p_2(n) \approx 1 - e^{-\frac{n(n-1)}{2 \times 365}}$.
• For n = 23, $p_2(23) \approx 0.507$. (Surprisingly!)
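
The exact product and the exponential approximation can be compared numerically. In the sketch below the function names p2_exact and p2_approx are assumptions. For n = 23 the exact product gives about 0.507, while the approximation gives about 0.500.

```r
# Exact probability of at least one shared birthday among n people.
p2_exact <- function(n) 1 - prod((365 - seq_len(n) + 1) / 365)

# Approximation based on exp(x) ~ 1 + x for small x.
p2_approx <- function(n) 1 - exp(-n * (n - 1) / (2 * 365))

p2_exact(23)
p2_approx(23)
```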


p2 values

n      p2(n)
1      0
5      2.7%
10     11.7%
20     41.1%
23     50.7%
50     97.0%
60     99.4%
70     99.9%
100    99.99997%
366    100%
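
A table like this can be regenerated from the exact product formula. The sketch below is one way to do it; the names p2 and n_values are assumptions.

```r
# Exact probability of at least one shared birthday among n people.
p2 <- function(n) 1 - prod((365 - seq_len(n) + 1) / 365)

n_values <- c(1, 5, 10, 20, 23, 50, 60, 70, 100, 366)
p2_table <- data.frame(n = n_values, p2 = sapply(n_values, p2))
p2_table
```

For n = 366 the product contains a zero factor, so the probability is exactly 1 (the pigeonhole principle). R's stats package also provides pbirthday and qbirthday for this kind of calculation.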


Generalization and Variation of Birthday Problem
• More generally, suppose we have N objects, where N is large. There are r people, and each chooses an object. Then, similarly to the above approximation,
  $p = P(\text{there is a match}) \approx 1 - e^{-\frac{r(r-1)}{2N}} \approx 1 - e^{-\frac{r^2}{2N}}$.
• Now if we want p ≥ 1/2, then we can choose $\frac{r^2}{2N} \ge \ln 2$, or $r \ge \sqrt{2 \ln 2} \, \sqrt{N} \approx 1.177 \sqrt{N}$.
• If there are N possibilities and we have a list of length $\sqrt{N}$, then there is a good chance of a match: ≈ 40%.
• If we want to increase the chance of a match, we can make a list whose length is a constant times $\sqrt{N}$.
• As a variation, suppose there are N objects and there are two groups of r people. Each person from each group selects an object. What is the probability that someone from the first group chooses the same object as someone from the second group?
  $P(\text{there is a match between two groups}) \approx 1 - e^{-\frac{r^2}{N}}$. (Rather difficult!)
• E.g., if we take N = 365 and r = 30, then
  $P(\text{there is a match between two groups}) \approx 1 - e^{-30^2/365} \approx 0.915$.
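
The approximations on this slide can be evaluated directly. The function names match_one_group and match_two_groups are hypothetical, introduced only for this sketch.

```r
# Approximate match probability when r people each pick one of N objects.
match_one_group <- function(r, N) 1 - exp(-r^2 / (2 * N))

# Approximate probability of a match between two groups of r people each.
match_two_groups <- function(r, N) 1 - exp(-r^2 / N)

sqrt(2 * log(2))                   # the constant 1.177 in r >= 1.177 * sqrt(N)
match_one_group(sqrt(365), 365)    # list of length sqrt(N): roughly 40%
match_two_groups(30, 365)          # the N = 365, r = 30 example
```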


A birthday attack on discrete logarithm

• We want to solve $\alpha^x \equiv \beta \pmod{p}$.
• Make two lists, both of length around $\sqrt{p}$:
  • 1st list: $\alpha^k \pmod{p}$ for random k.
  • 2nd list: $\beta\alpha^{-h} \pmod{p}$ for random h.
• There is a good chance that there is a match: $\alpha^k \equiv \beta\alpha^{-h} \pmod{p}$.
• Hence, x = h + k.
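
The attack can be tried on a toy instance. The sketch below makes several assumptions that are not in the slides: p is a small prime (1009), alpha = 11, the exponent 345 used to build beta, lists of length about 3·sqrt(p), and the helper names mod_pow and birthday_dlog. The inverse power is computed via Fermat's little theorem, and the result is reduced modulo p − 1, which is valid for prime p.

```r
# Modular exponentiation by repeated squaring (fine for small moduli, where
# intermediate products remain exactly representable as doubles).
mod_pow <- function(base, exp, mod) {
  result <- 1
  base <- base %% mod
  while (exp > 0) {
    if (exp %% 2 == 1) result <- (result * base) %% mod
    base <- (base * base) %% mod
    exp <- exp %/% 2
  }
  result
}

# Birthday attack sketch: look for alpha^k == beta * alpha^(-h) (mod p).
birthday_dlog <- function(alpha, beta, p,
                          list_len = ceiling(3 * sqrt(p)), max_tries = 100) {
  for (attempt in seq_len(max_tries)) {
    k <- sample(0:(p - 2), list_len, replace = TRUE)
    h <- sample(0:(p - 2), list_len, replace = TRUE)
    list1 <- sapply(k, function(e) mod_pow(alpha, e, p))
    # alpha^(-h) = alpha^(p - 1 - h) (mod p) for prime p (Fermat's little theorem)
    list2 <- sapply(h, function(e) (beta * mod_pow(alpha, p - 1 - e, p)) %% p)
    pos <- match(list1, list2)             # position of each list1 value in list2
    hit <- which(!is.na(pos))[1]           # first match, or NA if there is none
    if (!is.na(hit)) return((k[hit] + h[pos[hit]]) %% (p - 1))
  }
  NA                                       # no match found within max_tries
}

set.seed(1)                                # hypothetical seed
p <- 1009; alpha <- 11
beta <- mod_pow(alpha, 345, p)             # toy instance with a known exponent
x <- birthday_dlog(alpha, beta, p)
mod_pow(alpha, x, p) == beta               # checks that x solves alpha^x = beta
```

The returned x need not equal 345; any exponent with alpha^x ≡ beta (mod p) counts as a solution.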


Formal Probability

General Addition Rule
p(E1 ∪ E2) = p(E1) + p(E2) − p(E1 ∩ E2)

• If E1 ∩ E2 = ∅, the events are disjoint, which means they can't occur together;
• then p(E1 ∪ E2) = p(E1) + p(E2).



Example

Example (1)
If you choose an integer between 1 and 100 at random, what is the probability that it is divisible by either 2 or 5?
Short Answer:
\[ \frac{50}{100} + \frac{20}{100} - \frac{10}{100} = \frac{3}{5} \]

Example (2)
A survey reports that about 45% of the Vietnamese population has type O blood, 40% type A, 11% type B, and the rest type AB. What is the probability that a blood donor has type A or type B?
Short Answer:
40% + 11% = 51% (the two blood types are disjoint events)
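
Example (1) can be double-checked by counting directly and by applying the general addition rule. This is a sketch only; the variable names are assumptions.

```r
x <- 1:100
div2 <- x %% 2 == 0               # event E1: divisible by 2
div5 <- x %% 5 == 0               # event E2: divisible by 5

# Direct count of E1 union E2 ...
p_union <- mean(div2 | div5)

# ... and the general addition rule p(E1) + p(E2) - p(E1 and E2).
p_rule <- mean(div2) + mean(div5) - mean(div2 & div5)

c(p_union, p_rule)
```

Both computations give 3/5, because the 10 multiples of 10 would otherwise be counted twice.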


Conditional Probability (Xác suất có điều kiện)

• “Knowledge” changes probabilities


