Course Name and Code: Basic Econometrics – ECON 1313
Lecturer Name: Dr. Greeni Maheshwari
Class Group No. 1

Chocolate brand assigned:
RIO

Tran Hoang An
S3768191


Contents
Part 1: Descriptive statistics
Part 2: Model Analysis
a) Estimate regression model
b) Seasonal index
c) Impact of Disease on the sales of Rio chocolate
d) Impact of competitor’s pricing
Part 3: Conclusion
Part 4: References
Part 5: Appendix


Part 1: Descriptive statistics
a. Line chart

Figure 1: Rio’s sales volume (1998-2018)
b. Visual analysis
Based on Figure 1, the sales volume of Rio generally follows an upward trend, with no apparent
cycle. However, there is a clear seasonality effect on the sales volume. Within a year, Rio sales
volume is highest towards the end of the year (around December) and lowest in the middle of the
year (around August and September). This pattern repeats every year. In addition, there was an
irregular fluctuation in 2008, when Whooping Cow disease broke out. Sales volume in that year was
much higher than in other periods: it spiked in 2008 before falling back and resuming the
increasing trend from 2009 onwards.

c. Descriptive statistics


Table 1: Descriptive statistics of Rio’s sales volume
According to Table 1, the mean sales volume is 11.0495 tonnes, which is quite close to the median,
while the maximum and minimum sales volumes are 13.23 tonnes and 9.47 tonnes respectively. This
indicates a small spread in sales volume. Figure 1 shows that sales of about 13 tonnes, close to
the maximum over the past 20 years, occurred in 2008 during the Cow disease outbreak. The outbreak
lowered demand for dairy products and, since Rio is not made from milk, raised demand for Rio
chocolate and hence its sales. The standard deviation is small (0.736 tonnes), which indicates
that sales volume varies little around the mean.

Part 2: Model Analysis
a) Estimate regression model


Linear trend

Table 2: SPSS output (linear trend)

Table 3: SPSS model summary (linear trend)



As shown in Table 2, the estimated regression of linear trend is
Volume (Yt^) = 10.145 + 0.007*t



Quadratic trend

To fit a quadratic trend, an additional variable, t^2, must be created. Using SPSS, the following
result is obtained

Table 4: SPSS output (quadratic trend)

Table 5: SPSS model summary (quadratic trend)


Based on Table 4, the estimated regression model is
Volume (Yt^) = 9.981 + 0.011*t – 0.00001531*t^2



Exponential trend

To fit an exponential trend, the dependent variable should be log (Yt) instead of Yt. Therefore,
a new variable, log (Yt), is created. Using SPSS, we obtain the following regression result



Table 6: SPSS output (exponential trend)

Table 7: SPSS model summary (exponential trend)

Hence, the estimated regression is log (Yt)^ = 2.318 + 0.001*t
Out of the 3 models, the linear trend is the most suitable regression to depict the change in total
sales volume over time. Based on the line graph (Figure 1) and the visual analysis in Part 1, there
appears to be an upward linear trend: the dependent variable (sales volume) seems to increase by a
constant amount each period. Moreover, we conduct a hypothesis test on the coefficient of ‘t’ in
the linear regression model Yt^=B0 + B1*t,
H0: B1=0 (there is no linear trend)
H1: B1 ≠ 0 (there is a linear trend)
As shown in Table 2, the p-value is 0.000, which is less than 0.05 (significance level). Hence, we
reject H0 and conclude, at the 5% significance level, that there is a linear trend in the sales
volume of Rio chocolate.
Besides, the quadratic and exponential trends are unlikely to fit this case. Under the estimated
quadratic model, the trend is upward but the rate of increase gets smaller over time, and after a
maximum point a downward trend follows. In Figure 1, the rate of change seems to be constant rather
than diminishing, so the quadratic model can be ruled out. The same reasoning applies to the
exponential model: under an exponential trend, the rate of change increases over time, and no such
pattern is observed in Figure 1. Therefore, the exponential model does not fit either. As a result,
the linear trend model is the most suitable depiction of the case.
This is our final regression model
Yt^ = 10.145 + 0.007*t
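
As a rough check on the comparison above, the three fitted trend equations can be evaluated side by side. This is only a sketch using the rounded coefficients quoted from the SPSS tables. It assumes that log denotes the natural logarithm and that the sample runs from t = 1 to t = 252 (monthly data for 1998-2018); the function names are illustrative.

```python
import math

def linear_trend(t):
    # Table 2: Yt^ = 10.145 + 0.007*t
    return 10.145 + 0.007 * t

def quadratic_trend(t):
    # Table 4: Yt^ = 9.981 + 0.011*t - 0.00001531*t^2
    return 9.981 + 0.011 * t - 0.00001531 * t ** 2

def exponential_trend(t):
    # Table 6: log(Yt)^ = 2.318 + 0.001*t, assuming a natural logarithm
    return math.exp(2.318 + 0.001 * t)

# Turning point of the fitted quadratic: t* = -b1 / (2 * b2)
quadratic_peak = 0.011 / (2 * 0.00001531)

if __name__ == "__main__":
    # Assumed sample points: first month, mid-sample, last month
    for t in (1, 126, 252):
        print(t, round(linear_trend(t), 3), round(quadratic_trend(t), 3),
              round(exponential_trend(t), 3))
    print("quadratic peak at t =", round(quadratic_peak, 1))
```

With these rounded coefficients the fitted quadratic peaks at roughly t = 359, beyond the assumed 252-month sample, so the curve flattens but does not turn down within the data. The exponential slope is rounded to three decimals, so its fitted values are only indicative.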

b) Seasonal index
Using Excel, we obtain the seasonal index (SI) values as shown below

Table 8: Excel calculation of SI values

Figure 2: SI value in 12 months
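
The Excel workings behind Table 8 are not reproduced in this report, so the following is only a minimal sketch of one common way to obtain a seasonal index: the average sales of each calendar month divided by the overall average, times 100. The sales figures and the function name `seasonal_index` are hypothetical, and the actual Excel calculation may use a different method (for example, ratio-to-moving-average).

```python
def seasonal_index(sales, start_month=1):
    """Return {month: SI} where SI = 100 * month average / overall average.

    `sales` is a list of monthly values in time order; `start_month` is the
    calendar month (1-12) of the first observation.
    """
    by_month = {m: [] for m in range(1, 13)}
    for i, value in enumerate(sales):
        month = (start_month - 1 + i) % 12 + 1
        by_month[month].append(value)
    overall_mean = sum(sales) / len(sales)
    return {m: 100 * (sum(v) / len(v)) / overall_mean
            for m, v in by_month.items() if v}

# Hypothetical two-year series: flat sales with a December peak.
example_sales = [10.0] * 11 + [12.0] + [10.5] * 11 + [12.6]
si = seasonal_index(example_sales)
```

Under this definition a value above 100 marks a month with higher-than-average sales, and with complete years the twelve indices average to 100.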


Now we will incorporate the SI element into our regression model. The table below is the SPSS
output after we include SI.

Table 9: SPSS output for model with SI

Based on Table 9, the estimated regression model is Yt^= -0.647 + 0.007*t + 0.108*SI
To determine whether SI has an impact upon sales volume, a hypothesis test is done upon B2 (the
coefficient of SI)

H0: B2=0 (there is no seasonality effect on sales volume)
H1: B2≠0 (there is a seasonality effect on sales volume)
Based on Table 9, the p-value of B2 is 0.000, which is less than 0.05 (significance level). Hence,
we reject H0 and conclude, at the 5% significance level, that seasonality has an effect upon sales
volume.

c) Impact of Disease on the sales of Rio chocolate

Table 10: SPSS output of model with Disease dummy variable

Based on Table 10, the estimated model is Yt^= -0.696 + 0.007*t + 0.108*SI + 1.037*D
Hypothesis testing upon B3 (coefficient of disease)
H0: B3=0 (disease has no impact on Rio sales volume)
H1: B3≠0 (disease has an impact on Rio sales volume)
The p-value of B3 is 0.000, which is less than 0.05 (significance level). Hence, we reject H0 and
conclude, at the 5% significance level, that there is a relationship between disease and Rio sales
volume. Therefore, the Disease dummy variable is a significant variable. Based on the value of B3
(1.037), there is a positive relationship between Disease and Rio sales volume.

d) Impact of competitor’s pricing

Table 11: SPSS output of model including competitors’ prices

In Table 11, Pt represents the price of Rio chocolate, while Pt_c1 to Pt_c8 represent the prices
of Caramel Squared, Gummi, Smartey, Heaven, Milkey, Treat, Lovely and Roca respectively.
Based on Table 11, our regression model is
Yt^= -0.864 + 0.007*t + 0.102*SI + 1.043*D – 0.242*Pt + 0.150*Pt_c1 + 0.262*Pt_c2 – 0.023*Pt_c3 +
0.039*Pt_c4 + 0.054*Pt_c5 + 0.035*Pt_c6 + 0.026*Pt_c7 – 0.029*Pt_c8


The definition of variables is summarised in the table below.

Type of variable        Symbol   Coefficient   Definition
Dependent variable      Yt       -             Sales volume of Rio chocolate (in tonnes)
Independent variables   t        B1            Time in months (t=1 represents Jan 98)
                        SI       B2            Seasonal index
                        D        B3            Presence of disease (1: disease present, 0: otherwise)
                        Pt       B4            Price per 100g ($) of Rio chocolate
                        Pt_c1    B5            Price per 100g ($) of Caramel Squared chocolate
                        Pt_c2    B6            Price per 100g ($) of Gummi chocolate
                        Pt_c3    B7            Price per 100g ($) of Smartey chocolate
                        Pt_c4    B8            Price per 100g ($) of Heaven chocolate
                        Pt_c5    B9            Price per 100g ($) of Milkey chocolate
                        Pt_c6    B10           Price per 100g ($) of Treat chocolate
                        Pt_c7    B11           Price per 100g ($) of Lovely chocolate
                        Pt_c8    B12           Price per 100g ($) of Roca chocolate

Table 12: Definition of variables

Looking at the coefficients in Table 11, there seems to be a negative relationship between Rio price
and its sales volume. Similarly, there is a negative relationship between price of Smartey/Roca and
Rio sales volume. For the remaining competitors, there is a positive relationship between their price
and Rio sales volume. Now, we will conduct hypothesis testing to check significance of the variables.
Hypothesis testing on B1, B2, B3, B4, B5 and B6
H0: B1=0 (no linear trend)
H1: B1≠0 (there is a linear trend)
The p-value is 0.00 (Table 11), which is less than 0.05 (significance level). Hence, we reject H0
and conclude, at the 5% significance level, that there is a relationship between t and Rio sales
volume (there is a linear trend). Therefore, t is a significant variable.

H0: B2=0 (no seasonality effect on Rio sales volume)
H1: B2≠0 (there is a seasonality effect on Rio sales volume)
The p-value is 0.000 (Table 11), which is less than 0.05 (significance level). Hence, we reject H0
and conclude, at the 5% significance level, that there is a seasonality effect on Rio sales.
Therefore, SI is a significant variable.

H0: B3=0 (no relationship between disease and Rio sales volume)
H1: B3≠0 (there is a relationship between disease and Rio sales volume)
The p-value is 0.000 (Table 11), which is less than 0.05 (significance level). Hence, we reject H0
and conclude, at the 5% significance level, that there is a relationship between Disease and Rio
sales volume. Therefore, Disease is a significant variable.

H0: B4=0 (price of Rio chocolate has no effect upon its sales volume)
H1: B4≠0 (there is a relationship between Rio price and its sales volume)
The p-value is 0.001 (Table 11), which is less than 0.05 (significance level). Hence, we reject H0
and conclude, at the 5% significance level, that there is a relationship between the price of Rio
chocolate and its sales volume. Therefore, Pt is a significant variable.

H0: B5=0 (no relationship between Caramel Squared price and Rio sales volume)
H1: B5≠0 (there is a relationship between Caramel Squared price and Rio sales volume)
The p-value is 0.037 (Table 11), which is less than 0.05 (significance level). Hence, we reject H0
and conclude, at the 5% significance level, that there is a relationship between the price of
Caramel Squared chocolate and Rio sales volume. Therefore, Pt_c1 is a significant variable.

H0: B6=0 (no relationship between price of Gummi chocolate and Rio sales volume)
H1: B6≠0 (there is a relationship between price of Gummi chocolate and Rio sales volume)
The p-value is 0.000 (Table 11), which is less than 0.05 (significance level). Hence, we reject H0
and conclude, at the 5% significance level, that there is a relationship between the price of
Gummi chocolate and Rio sales volume. Therefore, Pt_c2 is a significant variable.


Hypothesis testing on B7, B8, B9, B10, B11 and B12
Let Bi denote the coefficient on the price of each remaining brand (Smartey: i=7, Heaven: i=8,
Milkey: i=9, Treat: i=10, Lovely: i=11, Roca: i=12)
H0: Bi=0 (no relationship between price of brand i chocolate and Rio sales volume)
H1: Bi≠0 (there is a relationship between price of brand i chocolate and Rio sales volume)
Based on Table 11, the p-values are all greater than 0.05 (significance level). Hence, we do not
reject H0: at the 5% significance level, there is insufficient evidence of a relationship between
the price of brand i chocolate and Rio sales volume. Therefore, variables Pt_c3 to Pt_c8 are all
insignificant when tested individually.
Before removing these variables from the model, we need to check whether they are jointly
significant. Hence, an F-test is used. The following tables show the unrestricted and restricted
models from SPSS.

Table 13: Output of Unrestricted model (all variables included)



Table 14: Output of restricted model (Pt_c3,4,5,6,7,8 variables are restricted)

Hypothesis testing using F-test
- H0: B7=B8=B9=B10=B11=B12=0 (Pt_c3 to Pt_c8 are not jointly significant)
- H1: at least one Bi ≠ 0 (i=7,8,9,10,11,12) (at least one of Pt_c3 to Pt_c8 is significant
under joint testing)

Based on the SSR values found in Tables 13 and 14 (restricted SSR = 24.040, unrestricted SSR =
23.610), with q = 6 restrictions, n = 252 observations and k = 12 regressors in the unrestricted
model,
- F-stat = [(24.040 - 23.610) / 6] / [23.610 / (252 - 12 - 1)] = 0.725
- F-critical value (q, n-k-1) = F(6, 239) = 2.137

Since F-stat < F-critical, we do not reject H0. Hence, we conclude at the 5% significance level
that the variables Pt_c3 to Pt_c8 are not jointly significant: there is insufficient evidence of a
relationship between these variables and the dependent variable.
Hence, these variables can be removed from the regression model. The result of the final
regression model after removing the insignificant variables is reported in Table 15.
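
The joint F-statistic above can be reproduced in a few lines. This is a sketch that plugs in the SSR values quoted from Tables 13 and 14; the function name `joint_f_statistic` is illustrative, and the critical value is simply the figure reported in the text rather than one computed here.

```python
def joint_f_statistic(ssr_restricted, ssr_unrestricted, q, n, k):
    """F = [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)]."""
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / (n - k - 1)
    return numerator / denominator

# SSR values as quoted in the text; q restrictions, n observations,
# k regressors in the unrestricted model.
f_stat = joint_f_statistic(24.040, 23.610, q=6, n=252, k=12)
f_critical = 2.137  # 5% critical value reported in the text, not recomputed
reject_h0 = f_stat > f_critical
```

Because the statistic (about 0.725) is well below the reported critical value, `reject_h0` is `False`, matching the conclusion that the six prices are not jointly significant.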



Table 15: SPSS output of final model
Based on Table 15, the estimated regression is
Yt^= -0.388 + 0.007*t + 0.103*SI + 1.052*D - 0.245*Pt + 0.143*Pt_c1 + 0.251*Pt_c2
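
To make the fitted equation easier to work with, it can be written as a small function. This is an illustrative sketch: the coefficients are those quoted from Table 15, while the input values and the names `FINAL_MODEL` and `predict_sales` are hypothetical and chosen only to show how a ceteris paribus comparison works.

```python
# Coefficients from Table 15 as reported in the text.
FINAL_MODEL = {
    "const": -0.388, "t": 0.007, "SI": 0.103, "D": 1.052,
    "Pt": -0.245, "Pt_c1": 0.143, "Pt_c2": 0.251,
}

def predict_sales(t, SI, D, Pt, Pt_c1, Pt_c2):
    """Fitted Rio sales volume (tonnes) from the final regression model."""
    values = {"t": t, "SI": SI, "D": D, "Pt": Pt, "Pt_c1": Pt_c1, "Pt_c2": Pt_c2}
    return FINAL_MODEL["const"] + sum(FINAL_MODEL[name] * x
                                      for name, x in values.items())

# Hypothetical inputs, used only to illustrate the comparisons.
base = dict(t=120, SI=100.0, D=0, Pt=2.0, Pt_c1=2.0, Pt_c2=2.0)
disease_effect = predict_sales(**{**base, "D": 1}) - predict_sales(**base)
own_price_effect = predict_sales(**{**base, "Pt": 3.0}) - predict_sales(**base)
```

Changing one input while holding the others fixed shifts the fitted value by exactly that variable's coefficient, which is what "holding other variables constant" means for a linear model.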

Hypothesis testing on these coefficients (refer to Table 12 for the definition of variables and
symbols used)
- H0: Bi=0 (i=1,2,3,4,5,6) (there is no relationship between t/SI/Disease/Pt/Pt_c1/Pt_c2 and
Rio sales volume)
- H1: Bi≠0 (i=1,2,3,4,5,6) (there is a relationship between t/SI/Disease/Pt/Pt_c1/Pt_c2 and Rio
sales volume)
Table 15 shows that the p-values of all coefficients are less than 0.05 (significance level).
Hence, we reject H0 and conclude, at the 5% significance level, that there is a relationship
between each of t, SI, Disease, Pt, Pt_c1 and Pt_c2 and the sales volume of Rio. This means these
variables are significant.

Alternatively, we also used the backward elimination method in SPSS and obtained a similar result
(Table 16). Therefore, it can be concluded that t, SI, Disease, Pt, Pt_c1 and Pt_c2 are
significant variables in the model.



Table 16: Backward elimination result




Interpretation of coefficients

-0.388 indicates that the sales volume of Rio chocolate is -0.388 tonnes when all independent
variables are 0. This interpretation is not meaningful because sales volume cannot be negative.
Moreover, the prices of the chocolate brands cannot be zero.
0.007 means that with each additional month, the sales volume of Rio chocolate is expected to
increase by 0.007 tonnes, holding other variables constant.
1.052 means that the sales volume of Rio chocolate in a month with cow disease (in 2008) is
expected to be 1.052 tonnes higher than the sales volume in a month without disease, holding
other variables constant. This makes sense because the cow disease discouraged people from
consuming dairy products. Hence, the demand for non-dairy chocolate would be higher. Since Rio
chocolate is made without milk, there is an increase in demand for Rio chocolate, making its sales
volume higher, ceteris paribus.

-0.245 means that when the price of Rio chocolate increases by $1 per 100g, the sales volume of
Rio is expected to decrease by 0.245 tonnes, holding other variables constant. This makes sense
because the higher the price, the smaller the quantity demanded (law of demand). Therefore, people
would purchase a smaller amount of Rio chocolate when its price increases, ceteris paribus.
0.143 means that when the price of Caramel Squared chocolate increases by $1 per 100g, the sales
volume of Rio chocolate is expected to increase by 0.143 tonnes, holding other variables constant.
Similarly, 0.251 means that when the price of Gummi chocolate increases by $1 per 100g, the sales
volume of Rio chocolate is expected to increase by 0.251 tonnes, holding other variables constant.
These interpretations make sense because Caramel Squared and Gummi are Rio’s competitors.
Hence, an increase in their price relative to Rio’s may draw their customers away, who may now
purchase Rio because it becomes relatively cheaper. This boosts Rio’s sales volume.
To interpret the seasonal effect, we replace SI with dummy variables representing months. Based on
Table 8, the SI value in December is the highest, so we use December as the base category. Dummy
variables Jan, Feb, Mar, Apr, May, June, July, Aug, Sep, Oct and Nov are used (Jan=1 if that month
is January and 0 otherwise). Using SPSS, we obtain the following result



Table 17: Output with ‘Months’ dummy variables

Now, let’s interpret the coefficients of dummy variables.
-1.019 means that the sales volume of Rio in January is expected to be 1.019 tonnes lower than sales
volume in December, holding other variables constant.
-0.929 means that the sales volume of Rio in February is expected to be 0.929 tonnes lower than
sales volume in December, holding other variables constant.
-0.926 means that the sales volume of Rio in March is expected to be 0.926 tonnes lower than sales
volume in December, holding other variables constant.
-0.809 means that the sales volume of Rio in April is expected to be 0.809 tonnes lower than sales
volume in December, holding other variables constant.

-1.000 means that the sales volume of Rio in May is expected to be 1.000 tonnes lower than sales
volume in December, holding other variables constant.



-1.176 means that the sales volume of Rio in June is expected to be 1.176 tonnes lower than sales
volume in December, holding other variables constant.
-1.127 means that the sales volume of Rio in July is expected to be 1.127 tonnes lower than sales
volume in December, holding other variables constant.
-1.284 means that the sales volume of Rio in August is expected to be 1.284 tonnes lower than sales
volume in December, holding other variables constant.
-1.217 means that the sales volume of Rio in September is expected to be 1.217 tonnes lower than
sales volume in December, holding other variables constant.
-1.151 means that the sales volume of Rio in October is expected to be 1.151 tonnes lower than sales
volume in December, holding other variables constant.
-0.943 means that the sales volume of Rio in November is expected to be 0.943 tonnes lower than
sales volume in December, holding other variables constant.
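
Since every dummy coefficient is measured against December, the gap between any two months is simply the difference of their coefficients. The sketch below uses the Table 17 coefficients quoted above; the variable names are illustrative.

```python
# Month dummy coefficients from Table 17 (December is the base category).
MONTH_EFFECTS = {
    "Jan": -1.019, "Feb": -0.929, "Mar": -0.926, "Apr": -0.809,
    "May": -1.000, "Jun": -1.176, "Jul": -1.127, "Aug": -1.284,
    "Sep": -1.217, "Oct": -1.151, "Nov": -0.943, "Dec": 0.0,
}

# Rank months from highest to lowest expected sales, other variables held constant.
ranking = sorted(MONTH_EFFECTS, key=MONTH_EFFECTS.get, reverse=True)
lowest_month = ranking[-1]

# Expected difference (tonnes) between April and August sales.
gap_apr_vs_aug = MONTH_EFFECTS["Apr"] - MONTH_EFFECTS["Aug"]
```

On these figures December ranks highest and August lowest, in line with the seasonal pattern described in Part 1.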



Strength of the model

Hypothesis testing upon B1, B2, B3, B4, B5 and B6
H0: Bj=0, j=1,2,3,4,5,6 (there is no relationship between t/SI/Disease/Pt/Pt_c1/Pt_c2 and Rio sales
volume)
H1: Bj≠0, j=1,2,3,4,5,6 (there is a relationship between t/SI/Disease/Pt/Pt_c1/Pt_c2 and Rio sales
volume)
Based on Table 15, the p-values of all coefficients are less than 0.05 (significance level). Hence,
we reject H0 and conclude, at the 5% significance level, that there is a relationship between each
of these variables and the sales volume of Rio. So these variables are significant.

Table 18: Model summary



Moreover, the adjusted R-squared is 0.819, which is a high value (Table 18). This indicates that,
after adjusting for the number of regressors, about 81.9% of the variation in the sales volume of
Rio chocolate can be explained by the independent variables t, SI, Disease, Pt, Pt_c1 and Pt_c2.
Such a high adjusted R-squared suggests that the model fits the data well.
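
For reference, the adjusted R-squared is related to the ordinary R-squared by the standard formula below. The unadjusted R-squared from Table 18 is not quoted in the text, so only the general relationship is shown, with n = 252 observations and k = 6 regressors taken from the final model described above.

```latex
\[
\bar{R}^2 \;=\; 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}
        \;=\; 1 - (1 - R^2)\,\frac{251}{245}
\qquad (n = 252,\; k = 6)
\]
```

Because the correction factor 251/245 is close to 1, the adjusted and unadjusted values should be close for a sample of this size.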



Residual analysis

Figure 3: Residual plot

Figure 4: Histogram of residuals



Figure 3 shows that the residuals are scattered evenly instead of following any specific pattern.
Therefore, visual analysis suggests no autocorrelation, meaning that the errors are not correlated
over time. To double-check for autocorrelation, we regress Ut on U(t-1) using SPSS and obtain the
following result

Table 19: SPSS output of residual model


Based on Table 19, the regression model of the residuals is Ut=-0.014*U(t-1) + et
Hypothesis testing upon p (coefficient of U(t-1))
H0: p=0 (there is no autocorrelation)
H1: p≠0 (there is autocorrelation)
Based on Table 19, the p-value of p is 0.823, which is higher than 0.05 (significance level).
Hence, we do not reject H0. Therefore, at the 5% significance level, there is no evidence of
autocorrelation.
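
The coefficient in the residual regression above can be sketched as a regression through the origin of each residual on its own first lag. The residual series below are hypothetical, the function name is illustrative, and omitting the intercept is an assumption based on the form Ut = p*U(t-1) + et quoted from Table 19.

```python
def lag_one_coefficient(residuals):
    """OLS slope from regressing u_t on u_(t-1) without an intercept."""
    current = residuals[1:]
    lagged = residuals[:-1]
    numerator = sum(u * v for u, v in zip(current, lagged))
    denominator = sum(v * v for v in lagged)
    return numerator / denominator

# Hypothetical residual series, for illustration only.
decaying = [1.0, 0.5, 0.25, 0.125]       # each value is half the previous one
alternating = [1.0, -1.0, 1.0, -1.0]     # sign flips every period
rho_decaying = lag_one_coefficient(decaying)
rho_alternating = lag_one_coefficient(alternating)
```

A slope near zero, like the -0.014 reported in Table 19, indicates that knowing last month's residual says little about this month's.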
If autocorrelation did exist, we would need to create an AR(1) model, that is, include the lagged
dependent variable Y(t-1) as another regressor. Our initial model is Yt^= -0.388 + 0.007*t +
0.103*SI + 1.052*D - 0.245*Pt + 0.143*Pt_c1 + 0.251*Pt_c2.
We would therefore regress Yt on variables t, SI, Disease, Pt, Pt_c1, Pt_c2 and Y(t-1), and then
perform the same autocorrelation test as above. If the test still indicated autocorrelation, we
would choose one more variable and add its lag. For example, we could choose Pt and create one
more variable, Pt(t-1), then regress Yt on variables t, SI, Disease, Pt, Pt_c1, Pt_c2, Y(t-1) and
Pt(t-1) and perform the same autocorrelation test again. We would keep testing and adding lagged
variables as necessary until there is no autocorrelation. This is how autocorrelation can be
overcome.
Now, to check the normality assumption, we look at Figure 4, where the residuals do not appear to
follow a Normal distribution. Hence, the assumption of normality is violated. The implication is
that, when errors are not normally distributed, the dependent variable and the OLS estimators are
not exactly normally distributed either. This means that the usual F-test and t-test are no longer
exactly valid, although with 252 observations they may still hold approximately as large-sample
results. In other words, the significance testing may not correctly identify the significant
variables: some variables that appear significant under the t-test may not actually be
significant.



Checking assumptions


- 1st assumption: linear in parameters

In other words, the regressand must be a linear function of the parameters, so each parameter
enters the model with a power of 1. Our regression function satisfies this: every coefficient Bi
(i=1,2,3,4,5,6) appears to the power of 1 rather than, for example, as Bi-squared.
Hence, this assumption is not violated.

- 2nd assumption: no perfect collinearity

This means that no regressor may be an exact linear function of the others. In other words, one
explanatory variable cannot be expressed as a linear equation of other variables. To test for
multicollinearity, Variance Inflation Factor (VIF) values are used. As a rule of thumb, if the VIF
values of the regressors are smaller than 10, multicollinearity is not a serious concern. Using
SPSS, we obtained the result shown in Table 20

Table 20: VIF values of the estimators.

It is clear that all VIF values are in the range 1-2. Therefore, the assumption of no perfect
collinearity in our model is not violated.
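
The VIF of a regressor equals 1 / (1 - R^2), where R^2 comes from regressing that regressor on all the others. The following is a self-contained sketch of that calculation in plain Python. The helper names and the small data sets are hypothetical; it is not the SPSS procedure, only an illustration of the definition.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def r_squared(X, y):
    """R-squared from an OLS regression of y on the columns of X plus an intercept."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def vif(X):
    """VIF of each column: 1 / (1 - R^2) from regressing it on the other columns."""
    out = []
    for j in range(len(X[0])):
        target = [row[j] for row in X]
        others = [[v for i, v in enumerate(row) if i != j] for row in X]
        out.append(1.0 / (1.0 - r_squared(others, target)))
    return out

# Hypothetical data: two uncorrelated regressors, then two highly correlated ones.
uncorrelated = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
correlated = [[1, 1], [2, 2], [3, 3], [4, 5]]
vif_uncorrelated = vif(uncorrelated)
vif_correlated = vif(correlated)
```

Uncorrelated regressors give a VIF of 1, while the nearly collinear pair gives values far above the rule-of-thumb threshold of 10.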



- 3rd assumption: zero conditional mean.

This means the mean value of the errors (unobserved factors) is unrelated to the values of the
independent variables in all periods. In our case, this assumption may not hold because unobserved
factors may be related to the values of the independent variables. Unobserved factors such as
‘strength of brand name’ and ‘people’s taste and preference’ can affect Rio sales volume, and
these factors can be related to the regressors. For example, when there was Cow disease in the
past, people developed a preference for Rio (a non-milk product). Hence, even now, people may
still prefer Rio because of what happened in the past. Consumers may be afraid that the same
disease will happen again, so they are willing to buy Rio. Moreover, the Disease strengthened the
Rio brand name, since it was a safe product even during the Cow disease outbreak. Hence, the
strong brand name today can be attributed to the presence of Disease in the past. Therefore,
unobserved factors can be related to the past values of regressors, and the 3rd assumption may not
hold. This means that the OLS estimators may be biased and may not reflect the true population
parameters.

- 4th assumption: homoskedasticity

This means the variance of the errors must be constant regardless of the values of the regressors.
Based on Figure 3, the residuals are spread evenly, which suggests no heteroskedasticity. To
double-check the assumption, we use the Breusch-Pagan test: we compute the squared residuals and
regress them upon the independent variables. The result is shown below

Table 21: Breusch Pagan test result

Ui^2 = S0 +S1*t +S2*D +S3*SI + S4*Pt +S5*Pt_c1 + S6*Pt_c2 + error
Hypothesis testing
- H0: S1=S2=S3=S4=S5=S6=0 (there is no heteroskedasticity)
- H1: at least 1 out of 6 is different from 0 (there is heteroskedasticity)
Based on Table 21, the p-value is 0.493, which is more than 0.05 (significance level). Hence, we
do not reject H0. Therefore, at the 5% significance level, there is no evidence of
heteroskedasticity. This indicates that the homoskedasticity assumption is not violated.
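
For reference, the test statistic for the auxiliary regression above is usually computed in one of two standard forms, both based on the R-squared of that regression. The text does not say which form underlies the p-value in Table 21, so the following is only a general statement of the formulas, with k = 6 regressors and n = 252 observations assumed from the final model.

```latex
\[
F \;=\; \frac{R^2_{\hat{u}^2} / k}{\bigl(1 - R^2_{\hat{u}^2}\bigr) / (n - k - 1)}
\;\sim\; F_{k,\; n-k-1},
\qquad
LM \;=\; n \, R^2_{\hat{u}^2} \;\overset{a}{\sim}\; \chi^2_{k}
\]
```

Here \(R^2_{\hat{u}^2}\) is the R-squared from regressing the squared residuals on the regressors. Under H0 both statistics should be small, which is consistent with the large p-value reported.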
- 5th assumption: no autocorrelation

- 6th assumption: normality of errors

These 2 assumptions are already tested in the Residual analysis part.



Recommendation of other variables to be included

Consumer income can affect the sales volume of Rio chocolate. Hsu et al. (2002) found that
with higher income, consumers have greater purchasing power, which spurs them to increase
spending on food and snack products. Similarly, Majumdar (2004) argued that rising income can
raise the sales volume of the food retail industry. Therefore, there may be a positive
relationship between ‘household income’ and ‘sales volume of Rio’. We can include this variable
in our model to test the relationship.
Another variable that can be included is the ‘number of Rio’s advertisements’. Bruce et al. (2012)
found a positive relationship between advertising and growth in sales. Chakrabortty et al. (2013)
attributed this relationship to how advertising affects consumer mindset and behaviour. In fact,
effective advertising helps increase people’s awareness of the brand, hence it can attract
potential customers. In another study, Buil et al. (2013) argued that advertising also helps a
firm retain its competitiveness by constantly reminding customers of its product quality and
features. This builds the brand name and eventually increases sales volume. Therefore, we can
incorporate the variable ‘number of Rio’s advertisements’ in our model. There is probably a
positive relationship between Rio advertising and its sales volume.

Part 3: Conclusion
Overall, there is an increasing trend in the sales volume of Rio chocolate over the past 20 years.
Based on Table 8, out of 12 months, April, November and December are the 3 months with
higher-than-average sales of chocolate (SI>100), while the other months experience
lower-than-average sales (SI<100). One reason is that in April people consume a lot of chocolate
and buy chocolate as gifts during the Easter period (Tannenbaum 2012). November and December are
close to Christmas, hence the sales volume tends to spike (Johnson 2007). Therefore, in the
periods associated with major holidays, sales volume of chocolate tends to go up. In contrast, low
sales occur in August because this period is usually associated with hot weather. Many people
believe that it is too hot to eat chocolate in this month, so they switch to ice cream instead
(Smillie 2011). Therefore, exceptionally low sales happen in August every year.
Besides, the Whooping Cow disease in 2008 strongly affected the sales volumes of chocolate brands.
This disease caused people to avoid consumption of dairy products. Brands such as Rio, Caramel
Squared and Gummi were resistant to the disease because their chocolate does not contain milk.
Appendices 1, 2 and 3 show that the disease in 2008 significantly raised the sales volume of these
3 brands, as people switched to non-dairy chocolate brands. In contrast, the other 6 brands
(Smartey, Heaven, Milkey, Treat, Lovely and Roca) were not resistant to the disease because their
chocolate contains milk, in fair to heavy amounts. Since people avoided dairy products, the sales
volume of these brands dropped in 2008. The higher the milk content, the larger the fall in sales
volume. Appendices 4, 5 and 6 show that in 2008 the sales volume of Heaven and Milkey fell from
approximately 8 tonnes to 6 tonnes (25%), while Smartey’s sales dropped from 9 tonnes to 6 tonnes
(33%). Appendices 7, 8 and 9 show that in 2008 the sales volume of Treat, Lovely and Roca, whose
chocolate contains a higher amount of milk, dropped by more than 50%.
In Part 2, we found a positive relationship between ‘t’ and the sales volume of Rio, and this
relationship holds under significance testing. It indicates an upward linear trend in sales
volume, which is consistent with our visual analysis in Part 1. Besides, our econometric analysis
indicates that seasonality is a significant factor influencing Rio sales volume. By calculating
the average and comparing it with the monthly sales each year, we found that sales in certain
months (April, November and December) are higher than in the others. This pattern repeats every
year, which is consistent with our graphical observation in Part 1. We also found a positive
relationship between Disease and Rio sales volume, which holds under significance testing.
Therefore, the presence of disease is associated with higher Rio sales volume. This explains the
spike in Rio sales volume in 2008 when there was Cow disease (Figure 1). Hence, the econometric
findings are consistent with the graphical analysis.



Regarding the impact of price on Rio sales volume, we found that the higher the Rio price, the
smaller its sales volume. In Part 2, hypothesis testing shows that Rio price is a significant
factor affecting Rio sales volume and that the relationship between them is negative. In contrast,
there is a positive relationship between competitors’ prices and Rio sales volume (except for
Smartey and Roca). Under significance testing, the prices of Caramel Squared and Gummi are
significant factors affecting Rio sales volume while the prices of other competitors are not. This
suggests that only Caramel Squared and Gummi are relevant competitors, probably because Rio,
Caramel Squared and Gummi sell similar products, namely non-dairy chocolates. Hence, these are
direct competitors whose products are close substitutes for one another. Therefore, an increase in
the price of Caramel Squared or Gummi will drive their customers away, resulting in an increase in
the sales volume of Rio, which becomes relatively cheaper. In contrast, other brands sell
chocolate that contains milk, which is slightly different from Rio. Hence, these brands are not
Rio’s direct competitors, which explains their insignificant influence upon Rio’s sales volume.
Additional advice:
Since there is a large demand for chocolate in December, Rio can consider building up a large
stock to prepare for December sales. By placing large orders, Rio can negotiate with suppliers for
lower material costs. This helps Rio enjoy a lower unit production cost, which can boost its
profit.
Besides, this report only provides econometric analysis of some factors affecting Rio sales
volume, and these factors are largely external. To increase sales, it is important for Rio to look
at internal factors as well, namely the quality of the chocolate produced and its variety. To
build a good brand name and customer loyalty, product quality is still the most vital factor.
Therefore, Rio should focus on R&D to further improve its product quality and to offer more
variety to cater to different customers. As such, it can increase its sales volume in the long run.
