
Introduction to SPSS: Research Methods & Statistics Handbook, Part 4




the ranks scores for the other condition. The mean ranks for each of these three levels
are given, as well as the sums of the ranks for each and the number of cases that fall
under each level.

The main results are underneath this table, where the Z value and the p value are
given. The usual standard for significance applies: the result is significant if p is
less than 0.05.

How many cases are there where HWRATIO is greater than HWRATIO2?

Is there a significant difference between ranked height/weight ratios before and after
the exercise/diet program?



WEEK 4: October 24th
ANOVAS

This practical will familiarise students with the analysis of variance
(ANOVA). The ANOVAs in this practical are used when you want to determine
whether there is a significant difference between three or more groups and you have
only a single variable.


One-way ANOVA for Independent Samples

In this case, we want to determine if there is a significant difference in the height to
weight ratio between the three age groups in the sample in family.sav - children,
adults and elderly. We also want to carry out a Tukey's post-hoc test to identify where
those differences lie, if any. The procedure is remarkably similar to carrying out an
unrelated samples t-test. Go: ANALYZE, COMPARE MEANS, ONE-WAY
ANOVA

As you can see, the layout of the dialogue box is basically the same as the one for
unrelated t-tests from last week. First select your Dependent variable(s) - in this case
move the variable HWRATIO into the dependent list section. Your factor
(independent variable) is the variable AGEGRP. Press the Continue button.

Before running the analysis, press the Post-hoc button and turn on the Tukey's test.
Now press the Continue and Ok buttons and the analysis will be carried out.

OUTPUT

There are two sections to the results for the one-way ANOVA.

1. The first section indicates whether any significant differences exist between the
different levels of the independent variable. The between-groups and within-groups
sums of squares are listed, along with the degrees of freedom, the F-ratio and the
F-probability score (significance level). It is this last part that indicates
significance. If the F-prob. is less than 0.05 then a significant difference exists.
In this case, the F-prob. is shown as 0.000 (SPSS rounds to three decimal places, so
this means p is less than 0.0005), so we can say that there is a statistically
significant difference in height to weight ratios between the three age groups.

2. The post-hoc test identifies where exactly those differences lie. The final part of the
second section is a small table with the levels of the independent variable listed
down the side. Looking at the comparisons between these levels we see that
children have a significantly higher mean height to weight ratio than adults and
the elderly (this is also indicated by the asterisks).

For the meantime, ignore the third table of the output.
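To see where the numbers in the first section of the output come from, the sketch below works through the sums of squares, degrees of freedom and F-ratio by hand. The scores are made up for illustration (they are not the family.sav data), and the significance level itself is not computed here, since it comes from the F distribution with the two degrees of freedom shown.

```python
from statistics import mean

# Hypothetical height/weight ratios for three age groups (NOT the family.sav data).
groups = {
    "children": [0.60, 0.62, 0.58, 0.64],
    "adults": [0.45, 0.47, 0.43, 0.45],
    "elderly": [0.44, 0.42, 0.46, 0.44],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# Between groups: how far each group mean lies from the grand mean,
# weighted by the number of cases in the group.
ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())

# Within groups: how far each score lies from its own group mean.
ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

df_between = len(groups) - 1               # number of groups minus 1
df_within = len(all_scores) - len(groups)  # number of cases minus number of groups

# The F-ratio is the between-groups mean square over the within-groups mean square.
f_ratio = (ss_between / df_between) / (ss_within / df_within)

print(f"Between groups: SS = {ss_between:.4f}, df = {df_between}")
print(f"Within groups:  SS = {ss_within:.4f}, df = {df_within}")
print(f"F = {f_ratio:.2f}")
```

A large F-ratio means the group means differ by much more than the spread of scores inside each group would lead you to expect.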






One-way ANOVA for Related Samples

The procedure for running this is very different from anything you've done before.
The first step is easy enough - you need to add a third height to weight ratio variable,
representing the ratios for the subjects some time after they stopped doing the
exercise/diet plan. The data is below:

Variable Name: HWRATIO3
Variable Label: Height/Weight Ratio post-plan
Data: see table below

Subject Number    HWRATIO3 score
1                 .42
2                 .56
3                 .42
4                 .
5                 .41
6                 .40
7                 .30
8                 .78
9                 .71
10                .30
11                .55
12                .64
13                .40
14                .49
15                .55
16                .39
17                .52
18                .54
19                .49
20                .60


The next step is to run a single factor ANOVA by going: ANALYZE, GENERAL
LINEAR MODEL, REPEATED MEASURES

The dialogue box is different from the usual format. The first step is to give a name to
the factor being analysed, basically the thing the three variables have in common. All
three variables cover height to weight ratios, so

• in the “Within-Subject Factor Name:” box type RATIO.
• in the Number of Levels box, type 3 (representing the three variables)
• press the Add button, then the Define button

The next dialogue box is a bit more familiar. In the right-hand column, there are three
question marks with a number beside each. Select each of the three variables to be
included in the analysis, and move them across with the arrow button. Notice how
each of the variables replaces one of the question marks, indicating to SPSS which
three variables represent the three levels of the factor RATIO. Then proceed by
clicking on OK.

OUTPUT

Firstly, you can ignore the sections of the output titled “Multivariate Tests” and
“Mauchly's Test of Sphericity”.


You need to examine the section titled “Tests of Within-Subjects Effects”. This
section indicates whether any significant differences exist between the different levels
of the within-subjects variable. The degrees of freedom and sums of squares are listed,
as well as the F-score and its significance level. If the significance level is less than
0.05 then a significant difference exists. In this case, it is 0.001 (look at the row
labelled Sphericity Assumed), so we can say that there is a statistically significant
difference in height to weight ratios between the three times when measurements
were taken.
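The F-score in the Sphericity Assumed row can be reproduced by hand, which helps to show what the related-samples ANOVA does differently: the variation between subjects is removed before the three conditions are compared. The sketch below uses made-up scores for four subjects (not the practical's data) and does not compute the significance level.

```python
from statistics import mean

# Hypothetical ratios: one row per subject, columns are the three
# measurement occasions (before, after, post-plan). NOT the practical's data.
scores = [
    [0.50, 0.44, 0.47],
    [0.60, 0.52, 0.56],
    [0.40, 0.36, 0.38],
    [0.55, 0.47, 0.52],
]

n = len(scores)     # number of subjects
k = len(scores[0])  # number of levels of the within-subjects factor

grand_mean = mean(x for row in scores for x in row)
condition_means = [mean(row[j] for row in scores) for j in range(k)]
subject_means = [mean(row) for row in scores]

ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
ss_conditions = n * sum((m - grand_mean) ** 2 for m in condition_means)
ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)

# What is left once the condition and subject effects are taken out.
ss_error = ss_total - ss_conditions - ss_subjects

df_conditions = k - 1
df_error = (k - 1) * (n - 1)

f_ratio = (ss_conditions / df_conditions) / (ss_error / df_error)
print(f"F({df_conditions}, {df_error}) = {f_ratio:.2f}")
```

Because each subject acts as their own control, large differences between subjects do not inflate the error term as they would in the independent-samples ANOVA.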

You can ignore the section titled “Tests of Between-Subjects Effects”. It is irrelevant
here.

To do a post-hoc test to identify where the differences lie, the SPSS for Windows
made easy manual recommends doing Paired-Sample T-tests. In this case, the pairs to test are:

HWRATIO & HWRATIO2
HWRATIO & HWRATIO3
HWRATIO2 & HWRATIO3

From these three T-tests, you can determine which of the height to weight ratios are
significantly different from each other.
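The three pairings listed above are simply every combination of two variables out of three. A minimal sketch of the same idea, using made-up scores for four subjects and computing only the paired t statistic and its degrees of freedom (not the significance level):

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev

# Hypothetical scores for the same four subjects on the three variables.
data = {
    "HWRATIO": [0.50, 0.60, 0.40, 0.55],
    "HWRATIO2": [0.44, 0.52, 0.36, 0.47],
    "HWRATIO3": [0.47, 0.56, 0.38, 0.52],
}


def paired_t(a, b):
    """Return the paired-samples t statistic and its degrees of freedom."""
    diffs = [x - y for x, y in zip(a, b)]
    # Mean difference divided by the standard error of the differences.
    t = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return t, len(diffs) - 1


# Every pair of variables, in the order they were defined.
results = {
    (first, second): paired_t(data[first], data[second])
    for first, second in combinations(data, 2)
}

for (first, second), (t, df) in results.items():
    print(f"{first} & {second}: t({df}) = {t:.2f}")
```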



Kruskal-Wallis ANOVA (KWANOVA – Unrelated)

This is the nonparametric equivalent of the independent samples ANOVA, where ranks
are used instead of the actual scores. We will run the analysis on the same variables, so go
ANALYZE, NONPARAMETRIC TESTS, and K INDEPENDENT SAMPLES

As with the parametric test, move HWRATIO over to the test (dependent) variable list
and AGEGRP over to the Grouping (independent) variable list, and define the group
with a minimum of 1 and a maximum of 3. Click the Ok button. Notice that the
nonparametric ANOVA doesn't have a post-hoc test. If you run this ANOVA, you'll
have to consult a statistics book as to how to do a post hoc on the results. One way
would be to run a series of t-tests on all the combinations of the conditions.

OUTPUT

The first section gives you the mean ranks and the number of cases for each level of
the independent variable. The second section lists the Chi-Square value, degrees of
freedom and significance of the test.
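The mean ranks and the Chi-Square value in this output can be worked out by hand. The sketch below pools some made-up scores, ranks them, and computes the Kruskal-Wallis statistic. It assumes there are no tied scores (no tie correction is applied), and the shortcut used for the significance level only holds when there are exactly three groups, i.e. 2 degrees of freedom.

```python
from math import exp

# Hypothetical height/weight ratios for three age groups (NOT the family.sav data).
groups = {
    "children": [0.60, 0.62, 0.58, 0.64],
    "adults": [0.45, 0.47, 0.43, 0.48],
    "elderly": [0.44, 0.42, 0.46, 0.41],
}


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


# Rank all scores together, ignoring which group they came from.
pooled = [x for scores in groups.values() for x in scores]
ranks = average_ranks(pooled)
n_total = len(pooled)

mean_ranks = {}
weighted_sum = 0.0
start = 0
for name, scores in groups.items():
    group_ranks = ranks[start:start + len(scores)]
    start += len(scores)
    mean_ranks[name] = sum(group_ranks) / len(scores)
    weighted_sum += sum(group_ranks) ** 2 / len(scores)

# Kruskal-Wallis statistic (reported by SPSS as Chi-Square).
h = 12 / (n_total * (n_total + 1)) * weighted_sum - 3 * (n_total + 1)
df = len(groups) - 1

# For a chi-square distribution with exactly 2 degrees of freedom,
# the probability of a value above h is exp(-h / 2).
p_value = exp(-h / 2)

print(mean_ranks)
print(f"Chi-Square = {h:.3f}, df = {df}, p = {p_value:.3f}")
```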


Is there a significant difference between the three groups (remember you can't say
exactly what that difference is without a post hoc test)?


Friedman's - Related ANOVAs

This is the nonparametric equivalent of the related samples ANOVA, where ranks are
used instead of the actual scores. We will run the analysis on the same variables, so go
ANALYZE, NONPARAMETRIC TESTS, and K RELATED SAMPLES

This is much easier to run - just move the three variables (HWRATIO, HWRATIO2
and HWRATIO3) over to the right column and click OK.

OUTPUT

There is the Chi-square score, the d.f. and whether it's significant (as usual, has to be
less than 0.05). Again, for post-hoc tests, you'll probably have to consult a statistics
book or possibly run three non-parametric related samples tests.
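As a rough illustration of what Friedman's test does, the sketch below ranks each subject's three scores from 1 to 3, adds up the ranks for each variable, and turns those rank sums into the Chi-square score. The data are made up, and the shortcut for the significance level only holds for three variables (2 degrees of freedom).

```python
from math import exp

# Hypothetical scores: one row per subject, columns are
# HWRATIO, HWRATIO2 and HWRATIO3. NOT the practical's data.
scores = [
    [0.50, 0.44, 0.47],
    [0.60, 0.52, 0.56],
    [0.40, 0.36, 0.38],
    [0.55, 0.47, 0.52],
    [0.48, 0.49, 0.45],
]


def rank_row(row):
    """Rank one subject's scores from 1 (lowest); ties share the mean rank."""
    ordered = sorted(row)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in row]


n = len(scores)     # number of subjects
k = len(scores[0])  # number of related variables

ranked = [rank_row(row) for row in scores]
rank_sums = [sum(row[j] for row in ranked) for j in range(k)]

# Friedman's Chi-square, computed from the rank sums of the variables.
chi_square = 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)
df = k - 1

# Exact for 2 degrees of freedom only.
p_value = exp(-chi_square / 2)

print(rank_sums)
print(f"Chi-square = {chi_square:.2f}, df = {df}, p = {p_value:.3f}")
```

With these made-up scores the significance level comes out above 0.05, so the difference between the three variables would not count as significant.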



WEEK 5: 30th October
Study Week





WEEK 6: November 6th

No Practical




WEEK 7: November 14th

QUALITATIVE RESEARCH: STUDENT SEMINAR
PRESENTATION PREPARATION


Students should use this time to prepare work for their presentations. Dr. Alison will be
available in his office for guidance if necessary.

WEEK 8: November 21st
QUALITATIVE RESEARCH: STUDENT SEMINAR


WEEK 9: November 28th

INTERVIEWING AND DISCOURSE ANALYSIS
Conducting interviews etc.

This period should be used to conduct interviews in preparation for the session on
content analysis. Students are expected to conduct interviews or sessions that result in
naturally occurring language. It is important that this material is transcribed in
preparation for week 11's session. Dr. Alison will be available for consultation.


WEEK 10: December 5th

WORKING WITH NATURALLY OCCURRING
LANGUAGE
PREPARATION

Students will use this period to work with their material gathered in the previous
sessions. They should use this time to prepare for presentations in the final practical
session (12th December).



WEEK 11: December 12th

WORKING WITH NATURALLY OCCURRING
LANGUAGE: STUDENT SEMINAR


Students are expected to organise their own seminar presentations in this session on
the results and methods employed regarding the content analysis of their material.














SECTION III


EXTRA MATERIAL




For the benefit of students who wish to follow up other procedures in their own time,
we have included the following section which gives you some opportunity to play
with graphics packages and explore some issues associated with regression in
preparation for next term. Try not to worry if this all sounds unfamiliar at first. This
section is simply to give you a running start when it comes to your work after
Christmas.


REGRESSION


Simple Regression

In simple regression, the values of one variable (the dependent variable (y in this
case)) are estimated from those of another (the independent variable (x in this case))
by a linear (straight line) equation of the general form:

$$y' = b_0 + b_1 x$$

where $y'$ is the estimated value of $y$, $b_1$ is the slope (known as the regression
coefficient) and $b_0$ is the intercept (known as the regression constant).
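To make the slope and intercept concrete, here is a minimal sketch that fits the straight line to a handful of made-up ages and height/weight ratios. It assumes the usual least-squares method for choosing the line; the names b0 and b1 stand for the regression constant and the regression coefficient in the equation above.

```python
from statistics import mean

# Hypothetical data (NOT taken from family.sav).
ages = [10, 20, 30, 40, 50]              # independent variable x
ratios = [0.62, 0.55, 0.50, 0.47, 0.41]  # dependent variable y


def fit_line(x, y):
    """Return (b0, b1): the least-squares intercept and slope."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx           # slope: change in y for each unit of x
    b0 = y_bar - b1 * x_bar  # intercept: estimated y when x is 0
    return b0, b1


b0, b1 = fit_line(ages, ratios)
print(f"y' = {b0:.3f} + ({b1:.4f})x")

# Estimated ratio for a 30-year-old under this made-up model.
print(f"{b0 + b1 * 30:.2f}")
```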

Multiple Regression

In multiple regression the values of one variable (the dependent variable (y)) are
estimated from those of two or more variables (the independent variables
($x_1, x_2, \ldots, x_n$)). This is achieved by the construction of a linear equation of
the general form:

$$y' = b_0 + b_1 x_1 + b_2 x_2 + \ldots + b_n x_n$$

where the parameters $b_1, b_2, \ldots, b_n$ are the partial regression coefficients and the
intercept $b_0$ is the regression constant.
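The sketch below shows one way the regression constant and partial regression coefficients can be found for two independent variables. It assumes the least-squares method and solves the resulting equations with a small hand-written routine. The data are made up, and the y values were deliberately generated from y = 1 + 2x1 + 3x2 so that you can check the coefficients that come back.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    x = [0.0] * n
    for i in reversed(range(n)):  # back-substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_multiple(rows, y):
    """Return [b0, b1, ..., bn] for rows of independent-variable values."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 carries b0
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical data: each row is (x1, x2); y was built as 1 + 2*x1 + 3*x2.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [9, 8, 19, 18, 26]

b0, b1, b2 = fit_multiple(rows, y)
print(f"y' = {b0:.2f} + {b1:.2f}(x1) + {b2:.2f}(x2)")
```

With real data the points would not fit exactly, and the coefficients would be those that make the estimates as close as possible to the actual values overall.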

Residuals

When a regression equation is used to estimate the values of a variable (y) from those
of one or more independent variables (x), the estimates (y') will not be totally
accurate (i.e., the data points will not fall precisely on the straight line). The
discrepancies between y (the actual values) and y' (the estimated values) are known
as residuals and are used as a measure of accuracy of the estimates and of the extent
to which the regression model gives a good account of the data in question.
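A short sketch of how residuals are obtained, using made-up data and a line fitted by least squares (an assumption; the variable names are illustrative only). Each residual is the actual value minus the estimated value.

```python
from statistics import mean

# Hypothetical data (NOT taken from family.sav).
ages = [10, 20, 30, 40, 50]
ratios = [0.62, 0.55, 0.50, 0.47, 0.41]

# Fit the straight line y' = b0 + b1*x by least squares.
x_bar, y_bar = mean(ages), mean(ratios)
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, ratios))
      / sum((x - x_bar) ** 2 for x in ages))
b0 = y_bar - b1 * x_bar

estimates = [b0 + b1 * x for x in ages]

# Residual = actual value minus estimated value, one per case.
residuals = [y - y_est for y, y_est in zip(ratios, estimates)]

# The smaller the sum of squared residuals, the better the line fits.
ss_residual = sum(e ** 2 for e in residuals)

for age, e in zip(ages, residuals):
    print(f"age {age}: residual = {e:+.3f}")
print(f"Sum of squared residuals = {ss_residual:.4f}")
```

Positive residuals are cases the line underestimates and negative residuals are cases it overestimates; for a least-squares line they balance out to a sum of zero.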


The multiple correlation coefficient


One measure of the efficacy of regression for the prediction of y is the Pearson
correlation between the true values of the target variable y and the estimates y'
obtained by substituting the corresponding values of x into the regression equation.
The correlation between y and y' is known as the multiple correlation coefficient, R
(as opposed to r, which is Pearson's correlation between the target variable and any
one independent variable). In simple regression R takes the absolute value of r between
the target variable and the independent variable (so if r=-0.87 then R=0.87).
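The relationship between r and R in simple regression can be checked directly. The sketch below uses made-up data in which the ratio falls as age rises, so r is negative, and then correlates the actual values with the estimates to obtain R. The pearson function is a hand-written helper, not an SPSS feature.

```python
from math import sqrt
from statistics import mean


def pearson(a, b):
    """Pearson correlation between two equal-length lists of scores."""
    a_bar, b_bar = mean(a), mean(b)
    covariation = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    spread = sqrt(sum((x - a_bar) ** 2 for x in a) * sum((y - b_bar) ** 2 for y in b))
    return covariation / spread


# Hypothetical data (NOT taken from family.sav).
ages = [10, 20, 30, 40, 50]
ratios = [0.62, 0.55, 0.50, 0.47, 0.41]

# r: correlation between the target variable and the independent variable.
r = pearson(ages, ratios)

# Fit the line by least squares and compute the estimates y'.
x_bar, y_bar = mean(ages), mean(ratios)
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, ratios))
      / sum((x - x_bar) ** 2 for x in ages))
b0 = y_bar - b1 * x_bar
estimates = [b0 + b1 * x for x in ages]

# R: correlation between the actual values y and the estimates y'.
multiple_r = pearson(ratios, estimates)

print(f"r = {r:.3f}, R = {multiple_r:.3f}")
```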

Running Simple Regression

Using the family.sav file we want to look at how accurately we can estimate height to
weight ratios (HWRATIO) using the subject's age (AGE). To run a simple
regression, choose ANALYZE, REGRESSION and LINEAR.

As usual, the left column lists all the variables in your data file. There are two sections
for variables on the right. The “Dependent” box is where you move the dependent
variable. Move HWRATIO there. The “Independent(s)” box is where you move AGE.

• Next click the STATISTICS button, and turn on the “Descriptive” option.

• As already stated, a residual is the difference between the actual value of the
dependent variable and its predicted value using the regression equation. Analysis
of the residuals gives a measure of how good the prediction is and whether there
are any cases that should be considered outliers and therefore dropped from the
analysis. Click on “Case-wise diagnostics” to obtain a listing of any exceptionally
large residuals.

• Now click on CONTINUE.

• Now click on the PLOTS button. Since systematic patterns between the predicted
values and the residuals can indicate possible violations of the assumption of
linearity you should plot the standardised residuals against the standardised
predicted values. To do this transfer *ZRESID into the Y: box and *ZPRED into
the X: box and then click CONTINUE.

• Now click Ok.

Output

The first thing to consider is whether your data contains any outliers. There are no
outliers in this data. If there were this would be indicated in a table labelled
“Casewise Diagnostics” and the cases that corresponded to these outliers would have
to be removed from your data file using the filter option you learned previously.
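To give a feel for what counts as an exceptionally large residual, the sketch below standardises each residual by dividing it by the standard error of the estimate and lists any case beyond a cut-off. The cut-off of 3 is an assumption made for this example, and the data are constructed so that exactly one case (case 10) is an obvious outlier.

```python
from math import sqrt
from statistics import mean

# Hypothetical data: 20 cases lying on a straight line, except case 10,
# whose ratio has been raised by 0.2 to act as an outlier.
ages = [5 * i for i in range(1, 21)]
ratios = [0.70 - 0.003 * age for age in ages]
ratios[9] += 0.2

# Fit the straight line by least squares.
x_bar, y_bar = mean(ages), mean(ratios)
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, ratios))
      / sum((x - x_bar) ** 2 for x in ages))
b0 = y_bar - b1 * x_bar

residuals = [y - (b0 + b1 * x) for x, y in zip(ages, ratios)]

# Standard error of the estimate: residual sum of squares over n - 2,
# because two values (b0 and b1) were estimated from the data.
std_error = sqrt(sum(e ** 2 for e in residuals) / (len(ages) - 2))
standardised = [e / std_error for e in residuals]

CUTOFF = 3  # assumed threshold for an "exceptionally large" residual

# Case numbers (starting from 1) whose standardised residual exceeds the cut-off.
flagged = [case for case, z in enumerate(standardised, start=1) if abs(z) > CUTOFF]
print(flagged)
```

Note that a very small sample can never produce a flagged case under this rule, because one large residual also inflates the standard error it is divided by.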

With that out of the way, the first table (Descriptive Statistics) to look at is right at the
top. The first part gives the means and standard deviations for the two variables (e.g.
