
Econometrics essay: REPORT ON FACTORS ASSOCIATED WITH BODY MASS INDEX (BMI) IN VIETNAMESE ADOLESCENCE


FOREIGN TRADE UNIVERSITY
FACULTY OF INTERNATIONAL ECONOMICS

GROUP ASSIGNMENT – ECONOMETRICS

REPORT ON FACTORS ASSOCIATED
WITH BODY MASS INDEX (BMI) IN
VIETNAMESE ADOLESCENCE
Class: KTEE218.1
Group 9

Lecturer: Ms. Nguyen Thuy Quynh
Members: Phạm Nguyễn Xuân Lộc - 1816450047
Hoàng Ngọc Anh - 1814450010
Hoàng Hương Giang - 1814450026
Trần Thị Linh Chi - 1814450018
Nguyễn Thị Hạnh Hà - 1814450030

Hanoi 09/2019


I. ABSTRACT

The body mass index (BMI) is a convenient rule of thumb used to broadly categorize a person as underweight, normal weight, overweight, or obese based on tissue mass (muscle, fat, and bone) and height.
The BMI is generally used to compare groups of similar overall mass and can serve as a rough means of estimating adiposity. It is easy to calculate. On the whole, the index is suitable for recognizing trends within sedentary or overweight individuals because there is a small margin of error. The BMI has been used by the WHO as the standard for recording obesity statistics since the early 1980s.
In this report we examine the factors associated with BMI in Vietnamese adolescents.
The research was conducted on 152 people aged 18 to 25. The data were collected using a questionnaire that asked about general characteristics of individuals: height, weight, average time spent on exercise, and so on.
The average BMI of the individuals differs according to each person's routine and how they consume calories.
The purpose of this report is to apply econometrics to examine the effect of different factors on BMI, and thus to find ways to live more healthily and to prevent overweight and underweight.


II. CONTENTS

I. ABSTRACT
II. CONTENTS
III. LIST OF ABBREVIATIONS
IV. LIST OF TABLES
V. LIST OF FIGURES
VI. INTRODUCTION
VII. CHAPTER I: RATIONALE OF THE STUDY
7.1 Basis for variables and model choosing
7.2 Variables
7.2.1 Dependent variable: BMI
7.2.2 Independent variables
7.2.3 Model
7.3 Assessment of the BMI metric
VIII. CHAPTER II: EMPIRICAL RESEARCH
8.1 Literature review
8.2 Objective
8.3 Quantitative analysis
8.4 Qualitative analysis
8.4.1 Empirical model: multiple regression model
8.4.2 Methodology
8.4.3 Data sources
8.4.4 Expectations
8.4.5 Estimation results
IX. CHAPTER III: HYPOTHESIS TESTING
9.1 P-value testing
9.2 Heteroscedasticity testing
9.3 Correlation between variables testing
X. RECOMMENDATIONS, DIFFICULTY AND LIMITATION OF THE STUDY
10.1 Recommendation
10.2 Difficulty
10.3 Limitation
XI. CONCLUSION
XII. REFERENCES
XIII. APPENDIX



III. LIST OF ABBREVIATIONS
- OLS: Ordinary Least Squares regression
- BMI: Body Mass Index

IV. LIST OF TABLES
Table 1: The BMI evaluation
Table 2: Variables
Table 3: Explanation variables
Table 4: Table collected
Table 5: Summary of simple statistics for variables
Table 6: Tabulation of sex
Table 7: Regression of bmi (dependent) on sleep, meal, income, exercise, sex (independent)
Table 8: Vif
Table 9: Des sleep, bmi, meal, income, exercise, sex
Table 10: Imtest, white

V. LIST OF FIGURES


VI. INTRODUCTION

BMI (Body Mass Index) is a simple, inexpensive, and noninvasive surrogate measure of body fat. In contrast to other methods, BMI relies solely on height and weight, and with access to the proper equipment, individuals can have their BMI routinely measured and calculated with reasonable accuracy. Furthermore, studies have shown that BMI levels correlate with body fat and with future health risks. High BMI predicts future morbidity and death. Therefore, BMI is an appropriate measure for screening for obesity and its health risks. Lastly, the widespread and longstanding application of BMI contributes to its utility at the population level. Its use has resulted in an increased availability of published population data that allows public health professionals to make comparisons across time, regions, and population subgroups.
Obesity is a medical condition that occurs when a person carries excess weight or body fat that might affect their health. Obesity and excess weight can increase the risk of developing a number of health conditions, including metabolic syndrome, arthritis, and some types of cancer. The causes of obesity vary from consuming too many calories to leading a sedentary lifestyle. Recent hypotheses in the scientific community suggest the current obesity epidemic is being driven largely by environmental factors (e.g., high-energy/high-fat foods, fast food consumption, television watching, "super-sized" portions). Vietnamese people are bombarded with images and offers of high-fat, high-calorie, highly palatable, convenient, and inexpensive foods. These foods are packaged in portion sizes that far exceed official recommendations. Furthermore, the physical demands of our society have changed, resulting in an imbalance between energy intake and expenditure. Today's stressful lifestyles compound the effects of environmental factors by impairing weight loss efforts and by promoting fat storage. With increasing urbanization and changing modes of transportation, it is no wonder that obesity has increased rapidly around the world in the last few decades. To help ease this "epidemic", we conducted this study of the factors affecting BMI.
Many factors affect BMI, such as age, sex, physical activity, individual income, and the number of calories consumed per day. However, how these factors affect BMI, and to what extent, remains unclear to most people. To shed light on this, our group carried out a survey. Because of our inexperience, we focus on one specific group of people. Our topic is: "Factors associated with body mass index (BMI) in Vietnamese adolescents".
We put our best effort into this report, but mistakes are inevitable. We hope that after reading it, you can give us feedback on how to improve its quality. Thank you!


VII. CHAPTER I: RATIONALE OF THE STUDY

7.1 Basis for variables and model choosing

No single factor can show on its own whether a person's weight is healthy, but combining several indicators gives a more complete picture. We therefore chose 8 variables that may directly or indirectly affect the BMI of Vietnamese adolescents:
1. Height
2. Weight
3. Gender
4. Personal income
5. Number of minutes spent exercising per day
6. Number of sleeping hours per day
7. Packets of milk drunk per day
8. Total number of meals per day

● The multiple regression model is the main model used in this report. We want to examine the factors that affect the BMI of a Vietnamese person, so we have one dependent variable and several independent variables. Given the values of the independent variables, we can predict the dependent variable (BMI).

7.2 Variables

7.2.1 Dependent variable: BMI
BMI (Body Mass Index) is an index used by doctors and health professionals to determine whether a person is obese, overweight, or underweight. It is most often used to gauge the degree of obesity. The main drawback of BMI is that it does not measure the amount of fat in the body, which is the potential risk factor for future health.¹

BMI is calculated as follows: BMI = (body weight) / (height x height)
- body weight: in kg;
- height x height: in m²;

The BMI evaluation table follows World Health Organization (WHO) standards. Note that the Asia-Pacific guidelines (IDI & WPRO) use lower cut-offs for overweight and obesity than the general WHO thresholds. You can assess your own BMI using the table below:

¹ Wikipedia


Table 1: The BMI evaluation

BMI               Weight status
Below 18.5        Underweight
18.5 – 24.9       Normal
25.0 – 29.9       Overweight
30.0 and above    Obese
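
The formula and the cut-offs in Table 1 can be sketched in a few lines of Python. This is only an illustration, not part of the survey workflow: the function names `bmi` and `weight_status` and the example respondent are assumptions made here, and values between 24.9 and 25.0 are treated as "Normal" because the table leaves that gap undefined.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / (height_m ** 2)


def weight_status(bmi_value: float) -> str:
    """Classify a BMI value using the cut-offs listed in Table 1."""
    if bmi_value < 18.5:
        return "Underweight"
    if bmi_value < 25.0:  # assumption: the 24.9-25.0 gap counts as Normal
        return "Normal"
    if bmi_value < 30.0:
        return "Overweight"
    return "Obese"


# Illustrative (made-up) respondent: 62 kg and 1.63 m.
example = bmi(62.0, 1.63)
print(round(example, 2), weight_status(example))
```

Keeping the calculation and the classification in separate functions makes it easy to swap in different cut-offs without touching the formula.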

7.2.2 Independent variables
● Height: the distance from the top to the bottom of a person's body.
● Weight: how much a person weighs.
● Gender: the physical and/or social condition of being male or female.
● Personal income: money earned by a person over a particular period of time.
● Number of minutes spent exercising: exercise is any bodily activity that enhances or maintains physical fitness and overall health and wellness.
● Number of sleeping hours per day: sleep is a naturally recurring state of mind and body, characterized by altered consciousness, relatively inhibited sensory activity, inhibition of nearly all voluntary muscles, and reduced interactions with surroundings.
● Packets of milk drunk per day: one packet of milk is 180 ml.
● Total number of meals per day: a meal is an occasion when food is eaten.

7.2.3 Model
Multiple regression is a statistical method used to predict the value of a dependent variable based on the values of two or more independent variables. The dependent variable is sometimes called the outcome, target, or criterion variable. The independent variables, which we use to predict the value of the dependent variable, are sometimes called predictor, explanatory, or regressor variables.²

7.3 Assessment of the BMI metric

● BMI is a quick and simple way to get an overall view of your health: its easy-to-use formula makes it useful for measuring across populations. Because of its simple design, it can easily be applied to research that compares obesity rates between different age ranges and geographical locations. The simplicity of calculating BMI also makes it easy for anyone to assess basic information about their physical health at home, without having to see a medical professional or buy expensive equipment.
● BMI works extremely well when used for what it was designed for: measuring obesity and weight across large populations. Because weight does not correlate directly with fat, and the amount of fat on one's body is not always directly correlated with health issues, BMI measurements are more accurate when used to study rates of obesity and malnutrition among populations. Used in this way, BMI can lead to productive conversations about health while still encouraging body positivity and self-love.
● BMI is a widely used metric: many people, including physicians, use BMI as a measure of health and fitness. According to the National Heart, Lung, and Blood Institute, it is a measure of body fat based on height and weight that applies to both men and women.

² statistics.laerd.com


VIII. CHAPTER II: EMPIRICAL RESEARCH

8.1 Literature review
In 1972, Keys et al. severely criticized the validity of the data published by Metropolitan Life Insurance. Instead, using better-documented weight-for-height data, they popularized the Quetelet Index in population-based studies and referred to it as the body mass index (BMI).
The distribution of BMIs in adult American men and women was determined in 1923 in 1026 individuals. The median BMI was 24, but the mean BMI was 25. The distribution curve was clearly skewed toward higher BMI values, and this trend has continued.
The reason for choosing this topic: our study focuses on people in Vietnam and adds some new variables, such as "average hours of exercise per day", "income per day", and "meals per day". Like the earlier studies, we use quantitative methods to analyse the factors affecting BMI.

8.2 Objective
Examine the effect of different sociodemographic factors on the BMI.


8.3 Quantitative analysis
Obesity is a widespread disease, often described as being as dangerous as an epidemic. A person is obese when they are so overweight that it threatens their health. People can become obese in simple ways, such as eating junk food or not exercising. In our modern world, with increasingly cheap, high-calorie food, prepared foods high in sugar available everywhere, increasingly automated lifestyles, growing urbanization, and the spread of automobiles, it is no wonder that obesity has increased rapidly around the world in the last few decades. An overweight person is at risk of health problems such as heart disease, diabetes, and cancer.
In England in 2016, 34% of men and 46% of women had a very high waist circumference. These proportions rose from 20% and 26% respectively in 1993 to 31% and 38% in 2001. There were 617,000 admissions to NHS hospitals in 2016/17 where obesity was recorded as either a primary or secondary diagnosis, an increase of 18 per cent on 2015/16 (525,000). Around two thirds of these admissions (66 per cent) were for women.
Overall, 47% of adults said they were trying to lose weight. Among the aids asked about, websites or mobile phone apps were used by 8% and activity trackers or fitness monitors by 6%. Half of the people who reported they were trying to lose weight were not using any of the aids or support asked about. In 2016, 66 per cent of men and 58 per cent of women aged 19 and over met the government's aerobic guidelines, while 21 per cent of men and 25 per cent of women were classed as inactive. 24 per cent of men and 28 per cent of women consumed the recommended five portions of fruit and vegetables a day.³

³ Health Survey for England, 2016: Summary of key findings



8.4 Qualitative analysis
8.4.1 Empirical model: multiple regression model
The basis of the BMI was devised by Adolphe Quetelet, a Belgian astronomer, mathematician, statistician and sociologist, between 1830 and 1850, during which time he developed what he called "social physics". The modern term "body mass index" (BMI) for the ratio of human body weight to squared height was coined in a paper published in the July 1972 edition of the Journal of Chronic Diseases by Ancel Keys and others. We classified the independent variables into two categories:
(1) individual factors: sex, height, weight, the number of meals per day, physical activity, sleeping hours.
(2) family and social factors: income.
The dependent variable is BMI (Body Mass Index).
We entered all of the predictors (individual, family) into one multiple linear regression model. BMI also depends on determinants that the model does not capture exactly, so we include a disturbance term $u_i$ to represent them.
The model is:
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \beta_4 X_{4i} + \beta_5 D_{1i} + u_i$$
where $X_1, \dots, X_4$, $D_1$ and $Y$ are the variables listed in Table 2.

Table 2: Variables

SIGN   TYPE                                   EXPLANATION                                  NAME        UNIT
X1     Independent variable (quantitative)    Sleep hours                                  sleep       Time (hours)
X2     Independent variable (quantitative)    Meals per day                                meal        Number of meals
X3     Independent variable (quantitative)    Income per month                             income      VND
X4     Independent variable (quantitative)    Average minutes for doing exercise per day   excercise   Time (minutes)
D1     Independent variable (qualitative)     Sex                                          sex         Males: 1; females: 0
Y      Dependent variable                     BMI                                          bmi

8.4.2 Methodology
The model was estimated using the Ordinary Least Squares (OLS) regression approach, which can also be used to predict output values for new samples. The information below elaborates on this method.
- Equations for the Ordinary Least Squares regression
Ordinary Least Squares regression (OLS) is more commonly named linear regression (simple or multiple depending on the number of explanatory variables).
In the case of a model with p explanatory variables, the OLS regression model writes:
$$Y = \beta_0 + \sum_{j=1}^{p} \beta_j X_j + \varepsilon$$
where $Y$ is the dependent variable, $\beta_0$ is the intercept of the model, $X_j$ corresponds to the jth explanatory variable of the model (j = 1 to p), and $\varepsilon$ is the random error with expectation 0 and variance $\sigma^2$.
In the case where there are n observations, the estimation of the predicted value of the dependent variable Y for the ith observation is given by:
$$\hat{y}_i = \hat{\beta}_0 + \sum_{j=1}^{p} \hat{\beta}_j X_{ij}$$
The OLS method corresponds to minimizing the sum of squared differences between the observed and predicted values. This minimization leads to the following estimators of the parameters of the model:
$$\hat{\beta} = (X' D X)^{-1} X' D y, \qquad \hat{\sigma}^2 = \frac{1}{W - p^*} \sum_{i=1}^{n} w_i (y_i - \hat{y}_i)^2$$
where $\hat{\beta}$ is the vector of the estimators of the $\beta_j$ parameters, $X$ is the matrix of the explanatory variables preceded by a vector of 1s, $y$ is the vector of the n observed values of the dependent variable, $p^*$ is the number of explanatory variables to which we add 1 if the intercept is not fixed, $w_i$ is the weight of the ith observation, $W$ is the sum of the $w_i$ weights, and $D$ is a matrix with the $w_i$ weights on its diagonal.
The vector of the predicted values can be written as follows:
$$\hat{y} = X (X' D X)^{-1} X' D y$$
- Variable selection in the OLS regression
An automatic selection of the variables is performed if the user selects too many variables compared to the number of observations. The theoretical limit is n-1, as with greater values the X'X matrix becomes non-invertible.
Deleting some of the variables may, however, not be optimal: in some cases we might not add a variable to the model because it is almost collinear with some other variables or with a block of variables, but it might be more relevant to remove a variable that is already in the model and to add the new variable instead.
For that reason, and also in order to handle cases where there are a lot of explanatory variables, other methods have been developed.
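
To make the estimator concrete, here is a minimal Python sketch of the unweighted case, where every weight equals 1, so D is the identity matrix and the estimator reduces to solving (X'X) β = X'y. It is an illustration under those assumptions rather than the software routine used for this report; the helper names `solve_linear_system` and `ols` and the demo data are made up.

```python
from typing import List


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular (perfectly collinear regressors)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def ols(x_rows: List[List[float]], y: List[float]) -> List[float]:
    """Return [intercept, slope_1, ..., slope_p] from the normal equations."""
    design = [[1.0] + list(row) for row in x_rows]  # prepend the column of 1s
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up data generated exactly from y = 1 + 2*x1 - 3*x2, so OLS should recover it.
x_demo = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0], [4.0, 3.0]]
y_demo = [1 + 2 * a - 3 * b for a, b in x_demo]
beta_demo = ols(x_demo, y_demo)
print([round(v, 6) for v in beta_demo])
```

The singularity check mirrors the remark above: when regressors are perfectly collinear, X'X cannot be inverted and no unique estimate exists.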

8.4.3 Data sources
To estimate the model, we obtained data from a survey whose form asks about height (metres), weight (kg), minutes per day spent exercising or working out, income (VND, including the individual's salary and allowance from family), sleeping hours, the number of meals per day, and sex (female/male).

8.4.4 Expectations
First of all, when forming an expectation about the sign of each coefficient, we assumed that all other factors are held constant.
- The expected sign of β0 is positive, because the dependent variable Y is BMI, which is always greater than 0; if all Xi = 0, Y takes the value β0.
- The expected sign of β1 is positive, because a person who sleeps more than they need will gain weight, which increases BMI.
- The expected sign of β2 is negative, because we assumed that eating fewer meals per day disrupts the metabolism, so that the person gains weight and BMI is higher than usual.
- The expected sign of β3 is positive, following the assumption that higher income means a higher standard of living, which helps people afford gym fees or other sports.
- The expected sign of β4 is negative, because spending more time on exercise helps lower a person's weight, which decreases BMI.
- The expected sign of β5 is negative, because sex is coded 0 for females and 1 for males and we expected the BMI of females to be higher than that of males.


8.4.5 Estimation results
To run a multiple linear regression with the given independent and dependent variables, the data should meet the Gauss-Markov assumptions. The first assumption, that the model is linear in parameters, is met because the regression takes the form:
$$y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + u$$
We use the sum command to obtain Obs (observations), Mean, Std. Dev. (standard deviation), Min and Max of the variables.
    Variable |   Obs        Mean    Std. Dev.         Min         Max
-------------+-------------------------------------------------------
         sex |   152    .4671053    .5005661            0           1
      height |   152    1.630526     .118354         1.45        1.85
      weight |   152    62.14474    12.39601           40          82
         bmi |   152     23.8024     6.03558    12.541143    39.00119
   excercise |   152    1.253289    .8253972            0         2.5
      income |   152     2503289     1553139            0     5000000
        meal |   152    2.480263    1.139132            1           4
       sleep |   152    5.957237    1.519132            3          10

From the table above it can be seen that:
- There are 152 observations.
- Height in the sample ranges from 1.45 m to 1.85 m, with a mean of 1.630526 m.
- Weight ranges from 40 kg to 82 kg.
- BMI ranges from 12.541143 to 39.00119, with a mean of 23.8024.

We use the tabulate command to find the proportions of males and females who took part in the survey:


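        sex |      Freq.     Percent        Cum.
------------+-----------------------------------
     Female |         81       53.29       53.29
       Male |         71       46.71      100.00
------------+-----------------------------------
      Total |        152      100.00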

As the table shows, of the 152 people who took part in the survey:
- 81 are female, accounting for 53.29 percent of the sample;
- 71 are male, accounting for 46.71 percent.
We use the regress command to estimate the coefficients:
reg bmi sleep meal income exercise sex
      Source |       SS       df       MS              Number of obs =     152
-------------+------------------------------           F(5, 146)     = 3264.04
       Model |  5451.89011     5  1090.37802           Prob > F      =  0.0000
    Residual |  48.7724273   146  .334057721           R-squared     =  0.9911
-------------+------------------------------           Adj R-squared =  0.9908
       Total |  5500.66254   151  36.4282287           Root MSE      =  .57798

         bmi |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
-------------+-----------------------------------------------------------------
       sleep |   3.960645   .0313438   126.36   0.000     3.898699     4.02259
        meal |  -.0575852   .0425167    -1.35   0.178     -.141613     .026442
      income |   2.17e-08   3.05e-08     0.71   0.478    -3.86e-08     8.2e-08
    exercise |   .0173715   .0581383     0.30   0.766      -.09753     .132273
         sex |   .0965581   .0965238     1.00   0.319    -.0942064      .28732
       _cons |   .2295313   .2371914     0.97   0.335    -.2392409     .698303

$\hat{\beta}_0 = 0.2295313$
$\hat{\beta}_1 = 3.960645$
$\hat{\beta}_2 = -0.0575852$
$\hat{\beta}_3 = 2.17 \times 10^{-8}$
$\hat{\beta}_4 = 0.0173715$
$\hat{\beta}_5 = 0.0965581$

The sample regression function is written as:

$$\hat{Y} = 0.2295313 + 3.960645 X_1 - 0.0575852 X_2 + 2.17 \times 10^{-8} X_3 + 0.0173715 X_4 + 0.0965581 D_1$$
This means that, holding the other variables constant:
- When sleeping hours, meals per day, income per month and average minutes of exercise per day all equal 0 and the respondent is female (D1 = 0), the predicted BMI equals the intercept, 0.2295313.
- For every additional hour of sleep per day, the model predicts that BMI increases by about 3.96 units.
- For every additional meal per day, the model predicts that BMI decreases by about 0.058 units.
- For every additional VND of monthly income, the model predicts that BMI increases by 2.17e-08 units, a trivial amount (about 0.02 units per million VND).
- For every additional unit of average daily exercise time, the model predicts that BMI increases by about 0.017 units.
- Males (D1 = 1) are predicted to have a BMI about 0.097 units higher than females.
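
The reading of the coefficients can be checked with a small Python sketch that evaluates the fitted equation. The coefficients are the point estimates from the regression table; the function name `predict_bmi` and the respondent's values are hypothetical and serve only to show that a one-unit change in a regressor moves the prediction by exactly its coefficient.

```python
# Point estimates copied from the regression table above.
COEFFICIENTS = {
    "_cons": 0.2295313,
    "sleep": 3.960645,
    "meal": -0.0575852,
    "income": 2.17e-08,
    "exercise": 0.0173715,
    "sex": 0.0965581,
}


def predict_bmi(sleep: float, meal: float, income: float, exercise: float, sex: int) -> float:
    """Fitted BMI from the sample regression function (sex: 1 = male, 0 = female)."""
    values = {"sleep": sleep, "meal": meal, "income": income, "exercise": exercise, "sex": sex}
    return COEFFICIENTS["_cons"] + sum(COEFFICIENTS[name] * v for name, v in values.items())


# Hypothetical respondent: 6 hours of sleep, 3 meals, 2,500,000 VND, exercise = 1, female.
baseline = predict_bmi(sleep=6, meal=3, income=2_500_000, exercise=1, sex=0)
one_more_hour = predict_bmi(sleep=7, meal=3, income=2_500_000, exercise=1, sex=0)
print(round(baseline, 3), round(one_more_hour - baseline, 6))
```

Setting every regressor to zero returns the intercept, which is how the constant term is interpreted in a level-level model.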



A first look at the figures:
- Number of observations: n = 152
- Total Sum of Squares: SST = 5500.66254
- Explained Sum of Squares: SSE = 5451.89011
- Residual Sum of Squares: SSR = 48.7724273
- Coefficient of determination (R-squared): R² = 0.9911
- Adjusted R-squared: adjusted R² = 0.9908

From the results above, R² = 0.9911, which is very high. This means that 99.11% of the sample variation in BMI (the dependent variable) is explained by the variation in sleeping hours, meals per day, income per month, average minutes of exercise per day and sex (the independent variables). Even though R² is very large, we investigate further in the following part.
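
The summary figures follow directly from the sums of squares, which the short Python sketch below reproduces. It uses this report's naming (SSE for the explained and SSR for the residual sum of squares); the variable names themselves are illustrative assumptions.

```python
import math

# Values copied from the ANOVA block of the regression output.
sse = 5451.89011   # explained (model) sum of squares
ssr = 48.7724273   # residual sum of squares
n, k = 152, 5      # observations and slope coefficients

sst = sse + ssr
r_squared = sse / sst
adj_r_squared = 1 - (ssr / (n - k - 1)) / (sst / (n - 1))
f_statistic = (sse / k) / (ssr / (n - k - 1))
root_mse = math.sqrt(ssr / (n - k - 1))

print(round(r_squared, 4), round(adj_r_squared, 4), round(f_statistic, 2), round(root_mse, 5))
```

The adjusted R² penalises the number of regressors by dividing each sum of squares by its degrees of freedom (146 for the residual, 151 for the total).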
Confidence intervals
From the regression result table, we obtain the following 95% confidence intervals. A 95% interval is interpreted as follows: if 100 samples of size 152 were collected and 100 confidence intervals constructed, we would expect about 95 of them to contain the true population parameter. When an interval does not include the null-hypothesized value of zero, we reject the null hypothesis that the coefficient is zero at the 5 percent significance level; when it includes zero, we cannot reject it.
- The confidence interval of the intercept is [-0.2392409 ; 0.698303].
- The confidence interval for β1 is
3.898699 ≤ β1 ≤ 4.02259
The interval [3.898699 ; 4.02259] does not include zero, so we reject the null hypothesis that the true β1 is zero. In other words, with the influence of other variables held constant, sleeping hours have a statistically significant effect on BMI.
- The confidence interval for β2 is
-0.141613 ≤ β2 ≤ 0.026442
The interval [-0.141613 ; 0.026442] includes zero, so we cannot reject the null hypothesis that the true β2 is zero. Therefore, β2 is not significant at the 5 percent level.
- The confidence interval for β3 is
-3.86e-08 ≤ β3 ≤ 8.2e-08
The interval [-3.86e-08 ; 8.2e-08] includes zero, so we cannot reject the null hypothesis that the true β3 is zero. With the influence of other variables held constant, income per month has no statistically significant effect on BMI at the 5 percent level.
- The confidence interval for β4 is
-0.09753 ≤ β4 ≤ 0.132273
When average exercise time per day increases by 1 unit and other things are unchanged, the mean value of BMI changes by a value within the interval from -0.09753 to 0.132273. Since this interval includes zero, the effect of exercise time on BMI is not significant at the 5 percent level.
- The confidence interval for β5 is
-0.0942064 ≤ β5 ≤ 0.28732
The interval [-0.0942064 ; 0.28732] includes zero, so we cannot reject the null hypothesis that the true β5 is zero. Therefore, β5 is not significant at the 5 percent level.
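
Each interval is the point estimate plus or minus a critical t value times the standard error. The Python sketch below rebuilds the intervals from the reported coefficients and standard errors. One assumption is labeled in the code: the critical value 1.9763 (Student's t, 146 degrees of freedom, two-sided 5%) is hard-coded and rounded, so the last digits may differ slightly from the software output.

```python
# Coefficient and standard error pairs copied from the regression table.
ESTIMATES = {
    "sleep": (3.960645, 0.0313438),
    "meal": (-0.0575852, 0.0425167),
    "income": (2.17e-08, 3.05e-08),
    "exercise": (0.0173715, 0.0581383),
    "sex": (0.0965581, 0.0965238),
    "_cons": (0.2295313, 0.2371914),
}

# Assumption: two-sided 5% critical value of Student's t with 146 degrees of
# freedom, rounded to four decimals. It is hard-coded because the standard
# library has no t quantile function.
T_CRITICAL = 1.9763


def confidence_interval(coef: float, std_err: float, t_crit: float = T_CRITICAL):
    """Return (lower, upper, t_statistic, contains_zero) for one coefficient."""
    margin = t_crit * std_err
    lower, upper = coef - margin, coef + margin
    return lower, upper, coef / std_err, lower <= 0.0 <= upper


for name, (coef, se) in ESTIMATES.items():
    lower, upper, t_stat, has_zero = confidence_interval(coef, se)
    verdict = "not significant" if has_zero else "significant"
    print(f"{name:>8}: [{lower:.6g}, {upper:.6g}]  t = {t_stat:.2f}  {verdict} at 5%")
```

An interval that contains zero and a p-value above 0.05 are two views of the same two-sided test, so both criteria should agree for every coefficient.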



IX. CHAPTER III: HYPOTHESIS TESTING

9.1 P-value testing
For each coefficient we test H0: βj = 0 against H1: βj ≠ 0. Back to the Stata table (α = 0.05):
- The coefficient on sleeping hours has p-value = 0.000 < 0.05, so we reject H0. The coefficient on sleeping hours is statistically significant at the 5% significance level.
- The coefficient on meals per day has p-value = 0.178 > 0.05, so we fail to reject H0. The coefficient on meals per day is not statistically significant at the 5% significance level.
- The coefficient on income per month has p-value = 0.478 > 0.05, so we fail to reject H0. The coefficient on income per month is not statistically significant at the 5% significance level.
- The coefficient on average minutes of exercise per day has p-value = 0.766 > 0.05, so we fail to reject H0. The coefficient on exercise is not statistically significant at the 5% significance level.
- The coefficient on sex has p-value = 0.319 > 0.05, so we fail to reject H0. The coefficient on sex is not statistically significant at the 5% significance level.
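
The decision rule is mechanical, so it can be written as a small Python helper. This is a sketch for illustration: the p-values are those reported in the regression table, and the function name `decide` is an assumption.

```python
ALPHA = 0.05

# p-values copied from the P>|t| column of the regression table.
P_VALUES = {"sleep": 0.000, "meal": 0.178, "income": 0.478, "exercise": 0.766, "sex": 0.319}


def decide(p_value: float, alpha: float = ALPHA) -> str:
    """Apply the p-value rule: reject H0 only when p is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


decisions = {name: decide(p) for name, p in P_VALUES.items()}
for name, decision in decisions.items():
    print(f"{name:>8}: p = {P_VALUES[name]:.3f} -> {decision}")
```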

9.2 Heteroscedasticity testing:
H0: homoscedasticity
H1: unrestricted heteroscedasticity

Applying the "imtest, white" command in STATA, we obtain the following result table:
. imtest, white

White's test for Ho: homoskedasticity
         against Ha: unrestricted heteroskedasticity

         chi2(14)     =     16.19
         Prob > chi2  =    0.3022

Cameron & Trivedi's decomposition of IM-test

---------------------------------------------------
              Source |       chi2     df      p
---------------------+-----------------------------
  Heteroskedasticity |      16.19     14    0.3022
            Skewness |       8.59      4    0.0721
            Kurtosis |      76.23      1    0.0000
---------------------+-----------------------------
               Total |     101.01     19    0.0000
---------------------------------------------------
Table…: Heteroskedasticity testing result



If the p-value (Prob > chi2) is smaller than α, we reject H0.
If the p-value (Prob > chi2) is larger than α, we fail to reject H0.
The result from the table above shows that P = Prob > chi2 = 0.3022, and α = 0.05.
P > α, so we fail to reject H0.
We conclude that there is no evidence of heteroskedasticity in this model at the 5% significance level.
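
As a cross-check on the reported p-value, the upper-tail probability of a chi-square variable has a simple closed form when the degrees of freedom are even, as they are here (14). The Python sketch below uses that closed form; the function name `chi2_sf_even_df` is an assumption, and because the statistic is only reported to two decimals the result should land close to, not exactly on, the 0.3022 in the output.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For even df the tail has the closed form exp(-x/2) * sum_{k<df/2} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    if x < 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


# White's test statistic and degrees of freedom as reported by imtest.
p_value = chi2_sf_even_df(16.19, 14)
ALPHA = 0.05
print(round(p_value, 4), "reject H0" if p_value < ALPHA else "fail to reject H0")
```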

9.3 Correlation between variables testing
We applied the "vif" command in STATA to test for multicollinearity among the explanatory variables and obtained the following result table:
. vif

    Variable |       VIF       1/VIF
-------------+----------------------
        meal |      1.04    0.965463
    exercise |      1.03    0.969234
       sleep |      1.01    0.992842
      income |      1.00    0.997146
-------------+----------------------
    Mean VIF |      1.02

As the table shows, the VIF of every variable is smaller than 10, so there is no evidence of serious multicollinearity among the variables.
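
The two columns of the vif output are linked by VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing variable j on the other explanatory variables and 1/VIF is the tolerance 1 - R_j². The Python sketch below recovers the VIFs and the implied auxiliary R² values from the reported tolerances; the names used are illustrative assumptions.

```python
# Tolerances (the 1/VIF column) copied from the vif output.
TOLERANCE = {"meal": 0.965463, "exercise": 0.969234, "sleep": 0.992842, "income": 0.997146}


def vif_from_tolerance(tolerance: float) -> float:
    """VIF_j = 1 / (1 - R_j^2), where the tolerance is 1 - R_j^2."""
    return 1.0 / tolerance


vifs = {name: vif_from_tolerance(t) for name, t in TOLERANCE.items()}
aux_r2 = {name: 1.0 - t for name, t in TOLERANCE.items()}  # R^2 of each auxiliary regression
mean_vif = sum(vifs.values()) / len(vifs)

for name in TOLERANCE:
    print(f"{name:>8}: VIF = {vifs[name]:.2f}, auxiliary R^2 = {aux_r2[name]:.4f}")
print(f"Mean VIF = {mean_vif:.2f}")
```

A VIF near 1 means the other regressors explain almost none of that variable's variation, which is why the auxiliary R² values here are all below 0.04.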

X. RECOMMENDATIONS, DIFFICULTY AND LIMITATION OF THE STUDY

10.1 Recommendation
From the regression model, we found that, of the variables included (sleeping hours, income per month, meals per day, sex and average minutes of exercise per day), only sleeping hours is statistically significant for BMI at the 5% level. In addition, there are omitted variables: some independent variables that affect BMI were not included in the model.
After analysing the data, and based on the results obtained, the research team makes the following recommendations for maintaining a healthy BMI:
We recommend recording your food or calorie intake for a few days to understand what your eating habits are truly like. It may be the reality check you need to change your habits. Use whatever method you feel most comfortable with, whether that is writing in a journal or using an app on your smartphone.
As with monitoring your food intake, you need to know what your physical activity level is like.
Do not just plan to exercise; start right away and do more. Commit to walking for 20 minutes three times this week, and plan the days and times you are going to do it, for instance after work on Mondays, Wednesdays, and Fridays. If something comes up, know that you can shorten it to 5 or 10 minutes: everything counts.
Even if the weight does not seem to be coming off fast enough, stay the course. Research suggests that it is only with consistent efforts to eat well, move more, and maintain other healthy habits that affect weight (like getting enough sleep) that the pounds come off permanently.

10.2 Difficulty
Difficulty in choosing the topic: We had so many topics such as the factors
affect economic indicators, the factors impacting on the choice of adolescents in
services, and so on. But, we care more about health problems which is more and
more complicated. So the final topic was chosen based on its meaningful for young
people, in order that they can know the way to have a more confident body and a
good health and reduce risky by taking care of their health and body.
Difficulty in choosing the model: Our team had discussed and had run a trial
model to choose which model is the most suitable for this topic. The final answer is
to choose the linear regression model due to its simplicity and accuracy.
Difficulty in choosing the variables: choosing the variables which are suitable
for the topic is also very difficult. Taking the variables is not a hard problem but
deciding if when run model, the variables has capacity to impact on each other and
the model to survey is controversial. Because social characteristics of those variables,

It is also quite hard to decide if it is quantitative variable or qualitative variable.
Difficulty in surveying and conducting: Because of the limitation in condition
the survey was not clearly. The sample is quite small and limited in Ha Noi. So in
some aspect, the sample is not yet comprehensive
Difficulty in running the model: We met some technical problems. We made mistakes
when running the VIF test, partly because of our inexperience.
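For readers unfamiliar with the VIF test mentioned above, the calculation can be sketched as follows. This is a minimal illustration in plain Python, not the procedure the team actually ran: for each regressor x_j, regress it on the other regressors and compute VIF_j = 1 / (1 - R_j^2). The function names `ols_r_squared` and `vif` are hypothetical, and the data are the 14 observations excerpted in the appendix, with the column alignment of that excerpt assumed and income rescaled to millions of VND.

```python
# Illustrative VIF check in plain Python (no external libraries).
# Assumption: the data are the 14-row appendix excerpt, with columns paired
# row by row; income is rescaled to millions of VND (VIF is scale-invariant).

def ols_r_squared(y, columns):
    """R^2 from regressing y on an intercept plus the given columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [
        [sum(r[a] * r[b] for r in rows) for b in range(k)]
        + [sum(r[a] * yi for r, yi in zip(rows, y))]
        for a in range(k)
    ]
    # Gaussian elimination with partial pivoting.
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(aug[r][p]))
        aug[p], aug[best] = aug[best], aug[p]
        for r in range(p + 1, k):
            factor = aug[r][p] / aug[p][p]
            for c in range(p, k + 1):
                aug[r][c] -= factor * aug[p][c]
    # Back substitution for the coefficients.
    beta = [0.0] * k
    for p in reversed(range(k)):
        tail = sum(aug[p][c] * beta[c] for c in range(p + 1, k))
        beta[p] = (aug[p][k] - tail) / aug[p][p]
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif(data):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing x_j on the rest."""
    result = {}
    for name, column in data.items():
        others = [col for other, col in data.items() if other != name]
        result[name] = 1.0 / (1.0 - ols_r_squared(column, others))
    return result


regressors = {
    "sleep": [5, 7, 5.5, 7, 5, 5, 5.5, 10, 7, 7, 4.5, 7.5, 5.5, 6],
    "meal": [1, 1, 3, 2, 3, 1, 4, 2, 1, 4, 2, 1, 1, 2],
    "income": [0.5, 4.5, 5, 3, 4.5, 5, 0.5, 2.5, 2, 5, 3, 1, 0, 4],
    "exercise": [1, 0, 0.75, 0.25, 2, 2, 0.75, 1, 0.25, 2.25, 0.25, 0.5, 0.25, 0],
    "sex": [1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1],
}

vifs = vif(regressors)
for name, value in vifs.items():
    print(f"{name:>8}: VIF = {value:.2f}")
```

A VIF close to 1 means a regressor is nearly uncorrelated with the others; a common rule of thumb treats values above 10 as a sign of serious multicollinearity. With only 14 rows the figures printed here are illustrative and say nothing about the full sample of 152.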

10.3 Limitation
Choosing variables: Some of the chosen variables are not very meaningful in real
life. When building the model, we had to eliminate some unnecessary variables, and
we may also have left out important variables that have a strong impact on the
model.
Data source: The sample size is not big enough and the level of accuracy is
still limited, which may affect how representative the model is. We collected data
on Body Mass Index (BMI), an indicator of body fatness, and not all of us knew what
it was at the outset, which was a major limitation.



XI. CONCLUSION
The analysis in this report has revealed an inverse relationship between sleeping
hours, exercising hours, income and the BMI. This once again confirms the
significant impact of these variables on BMI. If we do not properly allocate the
hours spent on sleeping and exercising, the money earned and the money spent on
eating, the BMI will move away from its normal range. As a consequence, our body
will be unhealthy and out of shape: we might be too skinny, or we might be
overweight. Sleeping hours and exercising hours affect the BMI; the more hours spent
sleeping and the fewer meals we eat, the greater the BMI will be. This may seem
abnormal, because in reality, if we eat more, we might gain weight, and therefore
the BMI should increase. But this analysis shows the opposite. It might be explained
as follows:
-Eating fewer meals, each with a large portion, can cause the body's metabolic
system to stop working properly, so in the end weight rises and BMI increases.
-The sample size is not big enough to give a reliable result.
-Finally, income and expenditure on eating reflect a person's capacity to pay for
food. On average, the more money earned and the more money spent on eating, the
stronger the impact on BMI.
Taking all of these into consideration, a person can adjust the BMI through a
number of things such as sleeping hours, exercising hours, income, and expenditure
on eating. However, the conclusions above are also subject to a number of
limitations. First, it is unclear to what extent the results can be generalized to
countries other than Viet Nam. Each country has its own weather, eating habits and
typical body shape, all of which strongly affect BMI. Second, there may be other
variables that affect the BMI, such as genetics, working conditions, family
relationships and so on. Including these in the regression would increase the
precision of the estimates and reduce the risk of omitted-variable bias. This
investigation, however, is left for future research. Overall, BMI is an appropriate
measure for screening for obesity and its health risks. Lastly, the widespread and
longstanding application of BMI contributes to its utility at the population level.
Its use has resulted in an increased availability of published population data that
allows public health professionals to make comparisons across time, regions, and
population subgroups.






XIII. APPENDIX
Part A: Data

Table 3: Explanation of variables

No.  NAME  TYPE                  EXPLANATION                                 SIGN      UNIT
1    X     Independent variable  Sleep hours                                 sleep     Time (hours)
2    X     Independent variable  Meals per day                               meal      Number of meals
3    X     Independent variable  Income per month                            income    VND
4    X     Independent variable  Average minutes for doing exercise per day  exercise  Time (minutes)
5    X     Independent variable  Sex                                         sex       Males: 1; females: 0
     Y     Dependent variable    BMI                                         bmi

Table 4: Data collected

sex  bmi       height  weight  exercise  income   meal  sleep
1    20.51509  1.78    65      1         500000   1     5
1    28.68514  1.67    80      0         4500000  1     7
1    21.20311  1.71    62      0.75      5000000  3     5.5
0    28.53746  1.45    60      0.25      3000000  2     7
1    20.8307   1.78    66      2         4500000  3     5
0    20.60378  1.83    69      2         5000000  1     5
0    21.05171  1.51    48      0.75      500000   4     5.5
1    39.00119  1.45    82      1         2500000  2     10
1    27.5802   1.66    76      0.25      2000000  1     7
0    27.51338  1.64    74      2.25      5000000  4     7
1    18.50777  1.66    51      0.25      3000000  2     4.5
0    29.21841  1.48    64      0.5       1000000  1     7.5
1    22.76147  1.61    59      0.25      0        1     5.5
1    24.31412  1.66    67      0         4000000  2     6
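As a consistency check on the data excerpt above, the bmi column can be recomputed from the height and weight columns using the standard definition, BMI = weight / height². The sketch below assumes that height is recorded in metres and weight in kilograms (the table does not state the units) and that the values pair up row by row in the order listed. The function name `bmi` is illustrative.

```python
# Recompute BMI for the 14-row appendix excerpt and compare with the
# reported bmi column. Units (metres, kilograms) and the row-by-row pairing
# of the columns are assumptions, since the table does not state them.

heights_m = [1.78, 1.67, 1.71, 1.45, 1.78, 1.83, 1.51,
             1.45, 1.66, 1.64, 1.66, 1.48, 1.61, 1.66]
weights_kg = [65, 80, 62, 60, 66, 69, 48,
              82, 76, 74, 51, 64, 59, 67]
reported_bmi = [20.51509, 28.68514, 21.20311, 28.53746, 20.8307,
                20.60378, 21.05171, 39.00119, 27.5802, 27.51338,
                18.50777, 29.21841, 22.76147, 24.31412]


def bmi(weight_kg, height_m):
    """Body Mass Index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


computed = [bmi(w, h) for w, h in zip(weights_kg, heights_m)]
max_gap = max(abs(c - r) for c, r in zip(computed, reported_bmi))
print(f"largest gap between computed and reported BMI: {max_gap:.6f}")
```

Under these assumptions the recomputed values agree with the reported column to within rounding, which supports reading the three columns as aligned row by row.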
