STRATEGIC FINANCIAL MANAGEMENT
BASIC STATISTICS
KHURAM RAZA
First Principle and Big Picture
Summarizing Data
The problem that we face today is not that we have too
little information but too much. Making sense of large
and often contradictory information is part of what we
are called upon to do when analyzing companies.
Data Distributions
Summary Statistics
Data Distributions
Frequency distribution.
Discrete distribution.
Continuous distribution.
You can summarize even the largest data sets into one distribution and get a measure of:
What values occur most frequently, and
The range of high and low values.
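As a minimal sketch of the idea above, the snippet below builds a frequency distribution and reads off the most frequent value and the range. The `observations` list is hypothetical data invented for illustration; it does not come from the slides.

```python
from collections import Counter

# Hypothetical sample: number of defects found in 12 inspected batches.
observations = [2, 3, 3, 1, 2, 3, 5, 2, 3, 4, 1, 3]

# Frequency distribution: each distinct value mapped to how often it occurs.
frequency = Counter(observations)

# The value that occurs most frequently, with its count.
most_common_value, most_common_count = frequency.most_common(1)[0]

# The range of high and low values.
lowest, highest = min(observations), max(observations)

# Print a simple text histogram, one row per distinct value.
for value in sorted(frequency):
    print(f"{value}: {'*' * frequency[value]}")
print("most frequent:", most_common_value, "| range:", lowest, "to", highest)
```

Twelve observations collapse into five rows, which is the point of a distribution: the shape is visible at a glance.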
Summary Statistics
The information that gives a quick and simple description of the
data.
Measures of Central Tendency
  Mean
  Quantiles
Measures of Dispersion
  Variance
  Standard deviation
Relative Measures of Variation
  Coefficient of Variation (CV)
  Standardized Variable (Z-Score)
Example summary values (from the slide's figure):
  Minimum Height: 6.2
  Average Height: 6.68
  Maximum Height: 7.3
  Average Change Per Day: 0.03
Mean
The mean is the average of the numbers: a calculated
"central" value of a set of numbers.
To calculate: Just add up all the numbers, then divide
by how many numbers there are.
Example: what is the mean of 2, 7 and 9?
Add the numbers: 2 + 7 + 9 = 18
Divide by how many numbers (i.e. we added 3 numbers): 18 ÷ 3 = 6
So the Mean is 6
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n} \]
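The calculation above can be sketched in a few lines of Python. The function name `mean` is chosen here for illustration only.

```python
def mean(values):
    """Arithmetic mean: add up all the numbers, then divide by how many there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# The worked example: (2 + 7 + 9) / 3
result = mean([2, 7, 9])
print(result)  # 6.0
```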
Quantiles
For individual observations or a discrete frequency distribution, the i-th quartile, j-th decile and k-th percentile are located in the ordered array or discrete frequency distribution by the following relations:
\[ Q_i = \frac{i(n+1)}{4}\text{-th observation in the distribution}, \quad i = 1, 2, 3 \]
\[ D_j = \frac{j(n+1)}{10}\text{-th observation in the distribution}, \quad j = 1, 2, \ldots, 9 \]
\[ P_k = \frac{k(n+1)}{100}\text{-th observation in the distribution}, \quad k = 1, 2, \ldots, 99 \]
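The location rule can be sketched as follows. The relations only say where the quartile, decile or percentile sits in the ordered data; using linear interpolation when that position is fractional is an assumption made here, and the data set is hypothetical.

```python
def quantile_position(i, n, parts):
    """1-based location of the i-th quantile: i * (n + 1) / parts.

    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles.
    """
    return i * (n + 1) / parts


def quantile_value(sorted_values, i, parts):
    """Value at the quantile location.

    Assumption: a fractional position is resolved by linear interpolation
    between the two neighbouring observations.
    """
    n = len(sorted_values)
    if n == 0:
        raise ValueError("need at least one observation")
    # Keep the position inside the valid range 1..n.
    position = min(max(quantile_position(i, n, parts), 1), n)
    lower = int(position)          # whole part of the 1-based position
    fraction = position - lower    # distance towards the next observation
    if lower == n:
        return float(sorted_values[-1])
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + fraction * (above - below)


# Hypothetical data, sorted into an ordered array (n = 9).
data = sorted([14, 3, 21, 8, 5, 13, 18, 7, 12])

q1 = quantile_value(data, 1, 4)       # position 2.5
median = quantile_value(data, 2, 4)   # position 5
q3 = quantile_value(data, 3, 4)       # position 7.5
p90 = quantile_value(data, 90, 100)   # position 9
print(q1, median, q3, p90)  # 6.0 12.0 16.0 21.0
```

Other interpolation conventions exist, so a spreadsheet or statistics package may report slightly different quartiles for the same data.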
Variance & Standard deviation
The variance and the closely related standard deviation are measures of how spread out a distribution is.
Variance measures the variability (volatility) from an average or mean.
Variance
Standard Deviation
Variance & Standard deviation
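Mr. X has eight eggs. Each egg was weighed and recorded as follows:
60 g, 56 g, 61 g, 68 g, 51 g, 53 g, 69 g, 54 g.
Mean:
\[ \bar{X} = \frac{\sum X}{n} = \frac{472}{8} = 59 \text{ g} \]
Variance:
\[ \frac{\sum (X - \bar{X})^2}{n} = \frac{320}{8} = 40 \text{ g}^2 \]
S.D.:
\[ \sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{40} \approx 6.32 \text{ g} \]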
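The egg example can be checked step by step in Python. This sketch divides by n, as the slide does (the population form); the sample form divides by n - 1 and would give a slightly larger result.

```python
import math

# Egg weights in grams, as recorded in the example.
weights = [60, 56, 61, 68, 51, 53, 69, 54]
n = len(weights)

# Mean: sum of the weights divided by the number of eggs.
mean_weight = sum(weights) / n

# Squared deviation of each weight from the mean.
squared_deviations = [(x - mean_weight) ** 2 for x in weights]

# Variance: average squared deviation (divides by n, as in the slide).
variance = sum(squared_deviations) / n

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(mean_weight, variance, round(std_dev, 2))  # 59.0 40.0 6.32
```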
Comparing Standard Deviations
[Figure: dot plots of Data A, Data B and Data C on a common axis from 11 to 21]
Data A: Mean = 15.5, S = 3.338
Data B: Mean = 15.5, S = 0.926
Data C: Mean = 15.5, S = 4.567
The smaller the standard deviation, the more tightly the scores cluster around the mean.
The larger the standard deviation, the more spread out the scores are from the mean.
Coefficient of Variation (CV)
\[ CV = \frac{S}{\bar{X}} \times 100\% \]
The CV can be used to compare two or more sets of data measured in different units, or in the same units but with different average sizes.
Use of Coefficient of Variation
Stock A:
– Average price last year = $50
– Standard deviation = $5
\[ CV_A = \frac{S}{\bar{X}} \times 100\% = \frac{\$5}{\$50} \times 100\% = 10\% \]
Stock B:
– Average price last year = $100
– Standard deviation = $5
\[ CV_B = \frac{S}{\bar{X}} \times 100\% = \frac{\$5}{\$100} \times 100\% = 5\% \]
Both stocks have the same standard deviation, but stock B is less variable relative to its price.
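The two-stock comparison can be reproduced with a short helper. The function name `coefficient_of_variation` is an illustrative choice, not something named in the slides.

```python
def coefficient_of_variation(std_dev, mean):
    """CV as a percentage: standard deviation relative to the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * std_dev / mean


# Stock A: average price $50, standard deviation $5.
cv_a = coefficient_of_variation(5, 50)
# Stock B: average price $100, standard deviation $5.
cv_b = coefficient_of_variation(5, 100)

print(cv_a, cv_b)  # 10.0 5.0
```

Because the CV is a ratio, the dollar units cancel, which is why it also works for comparing series measured in different units.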
Standardized Variable
Performance evaluation by z-scores
The industry in which sales rep Mr. Atif works has mean annual sales of $2,500 and a standard deviation of $500.
The industry in which sales rep Mr. Asad works has mean annual sales of $4,800 and a standard deviation of $600.
Last year Mr. Atif’s sales were $4,000 and Mr. Asad’s sales were $6,000.
Which of the representatives would you hire if you have one sales position to fill?
Performance evaluation by z-scores
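Sales rep. Atif (subscript B): industry mean \(\bar{X}_B\) = $2,500, \(S_B\) = $500, own sales \(X_B\) = $4,000
\[ Z_B = \frac{X_B - \bar{X}_B}{S_B} = \frac{4{,}000 - 2{,}500}{500} = 3 \]
Sales rep. Asad (subscript P): industry mean \(\bar{X}_P\) = $4,800, \(S_P\) = $600, own sales \(X_P\) = $6,000
\[ Z_P = \frac{X_P - \bar{X}_P}{S_P} = \frac{6{,}000 - 4{,}800}{600} = 2 \]
Mr. Atif is the better choice: his sales are 3 standard deviations above his industry mean, compared with 2 for Mr. Asad.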
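The comparison of the two sales reps can be sketched in Python. The helper name `z_score` is illustrative; the figures are the ones given in the example.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev


# Mr. Atif: sales $4,000; industry mean $2,500; standard deviation $500.
z_atif = z_score(4000, 2500, 500)
# Mr. Asad: sales $6,000; industry mean $4,800; standard deviation $600.
z_asad = z_score(6000, 4800, 600)

print(z_atif, z_asad)  # 3.0 2.0
```

Mr. Asad sold more in absolute terms, but the z-score measures each rep against his own industry, so the higher z-score identifies the stronger relative performance.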
Relationships in the Data
When there are two series of data, there are a number
of statistical measures that can be used to capture
how the two series move together over time.
Covariance
Correlations
Regressions
[Figure: chart of Sales, COGS, Selling Exp and Admin Exp; axis ticks run from 0 to 10,000 and from 100 to 1,000]
Covariance
Covariance indicates how two variables are related. A positive covariance means the variables are positively related, while a negative covariance means the variables are inversely related. The formula for calculating the covariance of sample data is:
\[ \mathrm{Cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} \]
The covariance between the returns
of the S&P 500 and economic growth is
1.53. Since the covariance is positive,
the variables are positively related—they
move together in the same direction.
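A minimal sketch of the sample covariance calculation follows. The two series are hypothetical numbers chosen to keep the arithmetic easy to follow; they are not the S&P 500 and economic growth data behind the 1.53 figure, which the slides do not reproduce.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of paired deviation products divided by n - 1."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("both series must have the same length")
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y_rising = [2, 4, 5, 4, 5]    # tends to rise with x
y_falling = [5, 4, 5, 4, 2]   # tends to fall as x rises

cov_positive = sample_covariance(x, y_rising)
cov_negative = sample_covariance(x, y_falling)
print(cov_positive, cov_negative)  # 1.5 -1.5
```

The sign shows the direction of the relationship. The size depends on the units of both series, so it is hard to interpret on its own.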
Correlation
Correlation is another way to determine how two variables are related. In
addition to telling you whether variables are positively or inversely related,
correlation also tells you the degree to which the variables tend to move
together.
The correlation measurement, called a correlation coefficient, will always take on a value between -1 and 1:
If the correlation coefficient is 1, the variables have a perfect positive correlation.
If the correlation coefficient is zero, no linear relationship exists between the variables.
If the correlation coefficient is -1, the variables are perfectly negatively correlated (or inversely correlated).
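The three cases above can be illustrated with a small sketch of the Pearson correlation coefficient, which is the usual measure meant by "correlation coefficient" (an assumption here, since the slides do not name it). The data are hypothetical.

```python
import math


def correlation(xs, ys):
    """Pearson correlation: co-movement scaled by the spread of each series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical series.
x = [1, 2, 3, 4, 5]
r_perfect_pos = correlation(x, [2, 4, 6, 8, 10])   # y = 2x
r_perfect_neg = correlation(x, [10, 8, 6, 4, 2])   # y = 12 - 2x
r_partial = correlation(x, [2, 4, 5, 4, 5])        # noisy upward trend

print(r_perfect_pos, r_perfect_neg, round(r_partial, 4))  # 1.0 -1.0 0.7746
```

Unlike covariance, the result has no units and always lies between -1 and 1, so values from different pairs of series can be compared directly.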
Correlation
A correlation coefficient of 0.66 tells you two important things:
Because the correlation coefficient is a positive number, returns on the S&P 500 and economic growth are positively related.
Because 0.66 is relatively far from zero (which would indicate no correlation), the correlation between returns on the S&P 500 and economic growth is strong.
Regressions
A regression uses the historical relationship between an
independent and a dependent variable to predict the future values
of the dependent variable. Businesses use regression to predict
such things as future sales, stock prices, currency exchange rates,
and productivity gains resulting from a training program.
Y = a + bX
b: Slope of the Regression
a: Intercept of the Regression
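One common way to estimate a and b from historical data is ordinary least squares. The slides do not say which method is intended, so treat the sketch below as an assumption; the data and the names `fit_line` and `predict` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line Y = a + bX."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X must vary to estimate a slope")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope: change in Y per unit change in X
    a = mean_y - b * mean_x    # intercept: value of Y when X is 0
    return a, b


def predict(a, b, x):
    """Predicted value of the dependent variable for a given X."""
    return a + b * x


# Hypothetical history: X is the independent variable, Y the dependent one.
x_hist = [1, 2, 3, 4, 5]
y_hist = [2, 4, 5, 4, 5]

a, b = fit_line(x_hist, y_hist)
forecast = predict(a, b, 6)
print(round(a, 2), round(b, 2), round(forecast, 2))  # 2.2 0.6 5.8
```

The forecast for X = 6 extends the fitted line one step beyond the observed data, so it is only as reliable as the assumption that the historical relationship continues to hold.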
Regressions