
Introduction to SPSS: Research Methods & Statistics Handbook (Part 3)

Double-click on the chart to move the histogram from the Chart Carousel Window to
a Chart Window. The menu bar and tool bar change to show editing facilities.

First, click on CHART, then OPTIONS, and select NORMAL CURVE; then hit OK. The
normal curve superimposed over the histogram is the one for the above mean and
standard deviation. Admittedly, it's difficult to make a decision with such a small
sample, but does the curve appear to be a good fit to the histogram?
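
To make "the normal curve for the above mean and standard deviation" concrete, here is a minimal Python sketch (not SPSS code). It fits a normal distribution to a sample and computes how many cases the curve would predict in each histogram bin. The sample values and bin edges are hypothetical and are not taken from the handbook's data.

```python
from statistics import NormalDist, mean, stdev

def expected_counts(values, bin_edges):
    """Expected number of cases per bin under a normal curve fitted to `values`."""
    fitted = NormalDist(mean(values), stdev(values))
    n = len(values)
    # Probability mass of each bin, scaled up to the sample size
    return [n * (fitted.cdf(hi) - fitted.cdf(lo))
            for lo, hi in zip(bin_edges, bin_edges[1:])]

# Hypothetical sample and bins, for illustration only
sample = [2, 4, 4, 5, 5, 5, 6, 6, 8]
edges = [0, 2, 4, 6, 8, 10]
for lo, hi, expected in zip(edges, edges[1:], expected_counts(sample, edges)):
    observed = sum(1 for v in sample if lo <= v < hi)
    print(f"{lo}-{hi}: observed {observed}, expected {expected:.2f}")
```

Comparing the observed and expected counts bin by bin is a rough numerical version of judging the fit by eye.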

Now, click on the 'swap axes' icon. Does the histogram look better with vertical bars
or horizontal bars?

Now try some of the other icons and tools to change the chart. These changes require
the appropriate part of the chart to have been selected. Click on any bar. The bars will
become highlighted with small black squares at their corners. Then click on the Fill
Pattern tool button (the rectangle with diagonal shading). To apply a pattern, click
on it and then click on Apply. Once you have finished with the patterns, click on Close.
Also, try the Colour Palette tool button (the one with the pen) and the Bar Labels
tool button (the one with the fingernails).

You can also change the style of the line showing the Normal curve, and the fill
pattern and colour of the background of the histogram. Once you have finished with
your work, select FILE and then SAVE CHART. Save your histogram as
artwork.chz

To copy a chart into Word, click on EDIT and then select COPY to copy the chart.
To move to Word, minimise SPSS and open Word. If Word is already open, press
ALT & TAB to move between programs. Once in Word, go to EDIT and then PASTE.
Finally, exit from SPSS for Windows by selecting FILE and then EXIT.



Section II: Manipulating the Data in the Matrix
(Computing, Recoding, Filtering and Deleting Data)


Computing Values

Start off SPSS and open the file family.sav (you should find this file on your M:
drive in the folder that you named survey). We shall use the COMPUTE command
to build up a new variable that will be labelled BMI, which stands for body mass
index. This is calculated as:

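Body mass index = weight (kg) / [height (m)]^2

In family.sav, weight is recorded in pounds and height in inches, so the expression
used below converts both to metric units (1 pound = 0.4536 kg, 1 inch = 0.0254 m).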


Select TRANSFORM and then COMPUTE and set the Target Variable to bmi.
Click on Type & Label and enter the label body mass index in the label box. Click
Continue to return to the Compute Variable dialog box. Using the source list on the
left and the calculator pad in the centre, build up


    weight * 0.4536 / (height * 0.0254) ** 2

in the numeric expression box. Run the completed command. The new variable is
added to the end of the data. We shall check the new variable by estimating a few
descriptive statistics using FREQUENCIES (via Analyze – Descriptive Statistics).
(Analyze – Descriptive Statistics – Explore would be a better command, but
Frequencies will do here).
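
The numeric expression can be checked by hand with a small Python sketch (an illustration only, not SPSS syntax). The function name and the example person are assumptions made for the illustration; the conversion factors are the ones used in the expression above.

```python
LB_TO_KG = 0.4536  # pounds to kilograms, as in the handbook's expression
IN_TO_M = 0.0254   # inches to metres, as in the handbook's expression

def bmi(weight_lb, height_in):
    """Body mass index in kg/m^2 from weight in pounds and height in inches."""
    return weight_lb * LB_TO_KG / (height_in * IN_TO_M) ** 2

# Hypothetical person: 150 pounds, 68 inches tall
print(round(bmi(150, 68), 1))
```

Working through one or two cases like this and comparing them with the values SPSS puts in the bmi column is a quick way to confirm the expression was typed correctly.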


Select ANALYZE, DESCRIPTIVE STATISTICS and then FREQUENCIES.
Move body mass index (bmi) to the Variable(s) box. Since bmi is a metric variable
with a potentially different value for every case in the data, suppress the frequency
tables by clearing the DISPLAY FREQUENCY TABLES check box. You will now
get a message saying 'You have turned off all output. Unless you request
Display Frequency Tables, Statistics or Charts, Frequencies will generate no output'.
No worries, we will estimate descriptive statistics by clicking on STATISTICS and
ticking the check boxes for the following: MEAN, MEDIAN, MINIMUM and
MAXIMUM. Run the command and look at the output.

• What are the sample values of the mean, median, minimum and maximum?

(The mean should be around 25.0. Any values outside the range 15.0 to 35.0 should
be queried).

• Do the sample statistics satisfy these rough checks? If not, something is wrong!
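
The same rough checks can be written down as a short Python sketch. The list of BMI values is hypothetical, and None stands in for a missing value; the 15.0 to 35.0 range is the one quoted above.

```python
from statistics import mean, median

def describe(values):
    """Mean, median, minimum and maximum, ignoring missing values (None)."""
    valid = [v for v in values if v is not None]
    return {"mean": mean(valid), "median": median(valid),
            "min": min(valid), "max": max(valid)}

def out_of_range(values, low=15.0, high=35.0):
    """Valid values outside the plausible range, which should be queried."""
    return [v for v in values if v is not None and not (low <= v <= high)]

# Hypothetical BMI values, for illustration only
bmi_values = [22.5, 27.0, None, 41.0, 19.5]
print(describe(bmi_values))
print(out_of_range(bmi_values))
```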


Conditionally Computing Values

Now we shall use the IF sub-command (via Transform – Compute) to set up a new
variable. The sub-command allows you to set up a new variable under the condition
that the original variable, which it is based on, fulfils certain criteria. We want to set
up a new variable AGEHOH for the age of the head of the household. In other
words, if a person in the sample is head of the household, AGEHOH shall indicate
that person's age.

Select TRANSFORM and then COMPUTE and clear the previous settings by
clicking on RESET. Set the Target Variable to AGEHOH and click on TYPE &
LABEL to assign the label age head of household. Click on Continue, and then set
the Numeric Expression to AGE. We want this (i.e., the current age in years) to be
applied when the case is head of household, which occurs when RELTOHOH is zero.
(For the variable RELTOHOH – relationship to head of household – the value 0
denotes that a person is head of household). Select IF… and INCLUDE IF CASE
SATISFIES CONDITION. Set up the condition RELTOHOH = 0 in the large box
and run the command. The variable AGEHOH should now be added to the end of the
data. Have a look at the new variable. You should see ages set for some cases only.
Let's check AGEHOH by moving it in the data matrix to the column after
RELTOHOH so that we can see what happened more clearly.

First we must make a space in the data matrix by inserting a new variable. Find
RELTOHOH by either scrolling through the DATA EDITOR window or by
selecting UTILITIES and VARIABLES…, selecting RELTOHOH from the source
list and then clicking on GO TO and CLOSE. Now click on any cell of the variable
that is immediately to the right of RELTOHOH (this variable should be sex). Then
select DATA and then INSERT VARIABLE. Alternatively, you can click on the
INSERT VARIABLE tool (which is the sixth button from the right).

Now, a blank column headed var00001 containing system-missing values (dots) is
inserted before the selected variable. Move AGEHOH to this column by
single-clicking on AGEHOH to highlight the column and then selecting EDIT and CUT.
To paste it in the desired location, single-click on the head of the blank column
(var00001) and select EDIT and then PASTE.

Look at the values in the DATA EDITOR window.

• Do all heads of household have AGEHOH set? If not, what might be the reason?

(Hint: Look at the variable that AGEHOH is derived from!).

• What value is set for cases who are not heads of household?
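
The logic of the conditional computation in this section can be sketched in Python as follows. This is an illustration under assumptions, not SPSS code: each case is a hypothetical dictionary, and None stands in for a system-missing value.

```python
def compute_agehoh(cases):
    """Return copies of the cases with 'agehoh' added.

    'agehoh' is the case's age when reltohoh equals 0 (head of household);
    otherwise it is left as None, standing in for system-missing.
    """
    result = []
    for case in cases:
        is_head = case.get("reltohoh") == 0
        new_case = dict(case)
        new_case["agehoh"] = case.get("age") if is_head else None
        result.append(new_case)
    return result

# Hypothetical cases, for illustration only
cases = [
    {"age": 45, "reltohoh": 0},
    {"age": 12, "reltohoh": 2},
    {"age": None, "reltohoh": 0},  # head of household whose age is missing
]
for case in compute_agehoh(cases):
    print(case)
```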


Re-coding Values

The RECODE command in SPSS is very powerful and efficient but it can be a little
tricky to set up due to the number of clicks required. We shall recode BMI into a new
variable BMIGRP, which takes the following values:

Value   Range                Interpretation
1       bmi < 25.0           Okay
2       25.0 ≤ bmi < 30.0    Overweight
3       bmi ≥ 30.0           Obese

Select TRANSFORM and then RECODE and INTO DIFFERENT VARIABLES.
Select BMI from the source list into the central INPUT VARIABLE – OUTPUT
VARIABLE box. Enter BMIGRP into the Name box and click on Change to
complete the INPUT VARIABLE – OUTPUT VARIABLE box. Also enter a
suitable variable label for BMIGRP in the LABEL box (e.g., categorical body mass
index).

To set up the recoding, click on OLD and NEW VALUES…. We build up the recode
specification for the third category of BMIGRP first. In the OLD VALUE box, select
RANGE and THROUGH HIGHEST and enter 30.0 in the box before THROUGH
HIGHEST. In the NEW VALUE section, enter 3 into the VALUE box. Then click
on ADD to copy the specification 30.0 THROUGH HIGHEST = 3 to the OLD –
NEW box. Build up the other two specifications, in the order 25.0 THROUGH 30.0 = 2
and LOWEST THROUGH 25.0 = 1. Now run the completed command.
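
The order of the three specifications matters because the ranges share their end points (30.0 and 25.0). The Python sketch below assumes that a value is assigned by the first specification it matches, which is how the boundaries in the value table come out as intended. It is an illustration, not SPSS syntax, and None stands in for a missing value.

```python
def recode_bmi(bmi):
    """Recode a BMI value into group 1, 2 or 3, checking the ranges in order."""
    if bmi is None:
        return None   # missing stays missing
    if bmi >= 30.0:   # 30.0 THROUGH HIGHEST = 3
        return 3
    if bmi >= 25.0:   # 25.0 THROUGH 30.0 = 2 (30.0 itself was matched above)
        return 2
    return 1          # LOWEST THROUGH 25.0 = 1 (25.0 itself was matched above)

# Hypothetical values, chosen to include both boundaries
for value in [19.8, 25.0, 29.9, 30.0, None]:
    print(value, recode_bmi(value))
```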






To finish, double-click on BMIGRP in the Data Editor window, and define suitable
value labels (i.e., 1= okay, 2 = overweight, 3 = obese).

• Are the values of BMIGRP correct for the first ten cases?


Filtering Cases

In this example, we shall filter cases. The filtering option allows you to exclude
certain cases from further analysis temporarily.

Before filtering, generate a two-way frequency table for ownrent by typaccm by
selecting ANALYZE, then DESCRIPTIVE STATISTICS and then CROSSTABS
and selecting ownrent for Row(s) and typaccm for Column(s). Run the command and
look at the table in the output.
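
A two-way frequency table simply counts how many rows of the data matrix fall into each combination of the row and column values. A minimal Python sketch follows; the variable names come from the text, while the codes in the example rows are hypothetical.

```python
from collections import Counter

def crosstab(cases, row_var, col_var):
    """Count the rows for each (row value, column value) combination."""
    return Counter((case[row_var], case[col_var]) for case in cases)

# Hypothetical rows, for illustration only
cases = [
    {"ownrent": 1, "typaccm": 1},
    {"ownrent": 1, "typaccm": 1},
    {"ownrent": 2, "typaccm": 1},
    {"ownrent": 2, "typaccm": 3},
]
table = crosstab(cases, "ownrent", "typaccm")
for (row, col), count in sorted(table.items()):
    print(f"ownrent={row}, typaccm={col}: {count}")
```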

1. What exactly does the frequency count in the first cell of the second table refer to?
6 what?

We shall filter using the variable PERSNO, which is the number of persons in the
household.

2. What will be the effect of selecting cases satisfying the condition persno=1? What
is the impact on households?


Now, select DATA and SELECT CASES and then IF CONDITION IS
SATISFIED, and make sure that UNSELECTED CASES are FILTERED. (This is
very important, as the alternative is DELETED, which we want to avoid for now!)

Select IF… and build up the condition persno = 1 in the large box. Run the
completed command. Find persno in the data editor window.

3. What appears in the status bar when filtering is in effect? (The status bar is at the
bottom of the window)

4. What has happened to case numbers with persno ≠ 1?

Rerun the CROSSTABS command (via Analyze – Descriptive Statistics) and look at
the new table in the output.

5. What exactly does the frequency count in the first cell refer to now? 3 what?

Go to the Data Editor Window and save the filtered data as familyf.sav. Then select
DATA, SELECT CASES and then ALL CASES. Run the command.

6. What happens to the status bar and the case numbers?






Deleting Cases


Instead of filtering cases, we shall now delete the unselected cases. This changes only
the working data in the Data Editor; the data files stored on disk are unaffected until
you save. Select DATA, SELECT CASES, IF CONDITION IS SATISFIED, which
picks up the previous condition persno = 1. Then select UNSELECTED CASES are
DELETED. Run the command and have a look at the Data Editor Window.
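
The difference between filtering and deleting can be sketched in Python as follows. This is an illustration under assumptions: cases are hypothetical dictionaries, and the flag name `filter_` is made up for the sketch. Filtering keeps every case and only marks which ones take part in analyses; deleting drops the unselected cases from the working data.

```python
def filter_cases(cases, condition):
    """Filtering: keep every case, adding a 0/1 flag for the condition."""
    return [dict(case, filter_=1 if condition(case) else 0) for case in cases]

def delete_cases(cases, condition):
    """Deleting: keep only the cases that satisfy the condition."""
    return [dict(case) for case in cases if condition(case)]

# Hypothetical cases, for illustration only
cases = [{"persno": 1}, {"persno": 3}, {"persno": 1}, {"persno": 2}]

def lives_alone(case):
    return case["persno"] == 1

print(len(filter_cases(cases, lives_alone)))  # all cases are still present
print(len(delete_cases(cases, lives_alone)))  # only the selected cases remain
```

Both functions return new lists and leave the original list untouched, which mirrors the point that the file on disk only changes when you save over it.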

1. How many cases are left?

2. What are the values of PERSNO?

3. What are the values of HSEMO? What does that successfully show?

Now, rerun the CROSSTABS command in the previous section and look at the
output.

4. Do the results agree with those obtained when cases are filtered?

Return to the Data Editor Window and save the selected cases to a NEW system file
named familyd.sav (after deleting cases you should do this as soon as possible to
avoid overwriting your complete data file by accident).

Finally, re-open familyf.sav, the filtered file you saved in the previous section.

5. Is filtering still on?

Exit from SPSS, saving the contents of the output window into output3.spo.
Open up family.sav that you saved to your survey folder.




WEEK 3: October 17th

T-Tests



Section I: Parametric T-tests (related & unrelated)

This practical will show you how to run a t-test so that you can look at the difference
between the means of two sets of scores.

Experimental designs can be of two basic types – within subject (dependent or
related) and between subject (independent or unrelated). The former is when all
subjects are subjected to all conditions (e.g., testing reaction times before and after
receiving a drug). Between subject designs are when you divide subjects into
independent groups, such as on the basis of gender, or into one group that receives a
drug, and a second that receives a placebo.

DEPENDENT OR RELATED SAMPLES T-TEST

First, a quick review of the test layouts.

1. Related Samples - two variables, one for each condition of the experiment. Each
subject has two scores, as a result:


Sub. No.   Variable 1 (first set of scores for     Variable 2 (second set of scores for
           the subjects, e.g. reaction time        the subjects, e.g. reaction time
           before taking the drug)                 after taking the drug)
1          10                                      30
2          11                                      31
3          12                                      32
4          10                                      30
5          9                                       29

2. Independent or Unrelated Samples - two variables, the first tells SPSS what
condition EACH subject belongs to, the second is the actual score for that subject:




Sub. No.           Variable 1 (what condition each       Variable 2 (actual score, e.g.
                   subject belongs to, e.g. group 1      each subject's reaction time)
                   are the controls, group 2 receive
                   the drug)
1 (control)        1                                     subject 1 score
2 (control)        1                                     subject 2 score
3 (experimental)   2                                     etc.
4 (experimental)   2                                     etc.
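
The two layouts can also be written out as data structures. In this Python sketch the related-samples rows use the example scores from the first layout, while the independent-samples scores are hypothetical, since the handbook gives none. The key names (`before`, `after`, `group`, `score`) are chosen for the illustration.

```python
from statistics import mean

# Related samples: one row per subject, one column per condition
related = [
    {"subject": 1, "before": 10, "after": 30},
    {"subject": 2, "before": 11, "after": 31},
    {"subject": 3, "before": 12, "after": 32},
    {"subject": 4, "before": 10, "after": 30},
    {"subject": 5, "before": 9, "after": 29},
]

# Independent samples: one row per subject, a group code plus a single score
independent = [
    {"subject": 1, "group": 1, "score": 12},  # hypothetical scores
    {"subject": 2, "group": 1, "score": 14},
    {"subject": 3, "group": 2, "score": 9},
    {"subject": 4, "group": 2, "score": 11},
]

def related_means(rows):
    """Mean of each condition: the two scores sit in separate columns."""
    return mean(r["before"] for r in rows), mean(r["after"] for r in rows)

def group_means(rows):
    """Mean score per group: the group code says which rows belong together."""
    groups = {}
    for r in rows:
        groups.setdefault(r["group"], []).append(r["score"])
    return {g: mean(scores) for g, scores in groups.items()}

print(related_means(related))
print(group_means(independent))
```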



T-Test for Related Samples

This is the parametric comparison of two related groups, for example, when you want
to compare mean scores for subjects at some task before and after taking a drug. Each
set of subject scores for the related t-test must be entered as an individual variable in
SPSS. So, in the above example, all the individuals' scores for the task before taking
the drug would be in one column and all the scores after taking the drug in another.

First, open family.sav. The next step is to add a variable to the data file, so that we can
run the related t-test. In this case, the comparison will be between the subjects'
height/weight ratio before they were put on a 4-week diet/exercise plan and after. The
variable already in the data set HWRATIO is the measure before. At the end of the
data file, add the variable HWRATIO2 to represent their measurements after the plan.
Using what you learned in the first lesson about entering data, create the new variable
using the information below:

Variable Name: HWRATIO2
Variable Label: Height/Weight Ratio after plan
Data: see table 1 below

To run the procedure, go to ANALYZE, COMPARE MEANS and then
PAIRED-SAMPLES T-TEST.

The usual dialogue box appears. The dialogue box has the two-column format. The
only difference is that you must select pairs of variables and move them across, rather
than just one variable at a time. To do this, you have to click on one variable, then
locate the other variable and click on it. The two variables that you have requested
should appear in the current selection box. After clicking on both, you then press the
arrow button to move the pair across. SPSS will analyse each pair to determine if their
means are significantly different statistically. In this case, select the variables
HWRATIO and HWRATIO2 and move them across, then press the OK button.

Table 1: Data for Height/Weight Ratio after a 4-week diet/exercise plan

Subject Number   HWRATIO2 score
1                .44
2                .52
3                .46
4                .
5                .44
6                .42
7                .33
8                .74
9                .80
10               .32
11               .60
12               .65
13               .40
14               .50
15               .57
16               .41
17               .60
18               .55
19               .49
20               .60


OUTPUT

The results appear in three sections:

• The first section gives you a table called Paired Samples Statistics with the mean
scores, standard deviations and standard error of the mean for the two variables.
• The second section is a table called Paired Samples Correlations showing the
correlation between the two variables and its level of significance.
• The third section is the most important. The table called Paired Samples Test
indicates the significance of the results. This includes the t-value, degrees of
freedom (d.f.) and the two-tailed significance level.
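
To see where the t-value and degrees of freedom in the Paired Samples Test table come from, here is a minimal Python sketch of the paired t statistic. It is an illustration, not SPSS's implementation: the scores are hypothetical, None stands in for a missing value, and the significance level is left out because it needs the t distribution, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """t-value and degrees of freedom for a paired-samples t-test.

    Differences are taken as first minus second. Pairs with a missing
    value (None) in either variable are dropped.
    """
    diffs = [a - b for a, b in zip(first, second)
             if a is not None and b is not None]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical before/after scores; the third pair is incomplete
before = [5, 7, None, 9, 6]
after = [3, 4, 2, 8, 6]
t_value, df = paired_t(before, after)
print(f"t = {t_value:.3f}, d.f. = {df}")
```

Note that the degrees of freedom are one less than the number of complete pairs, so a subject with a missing score reduces them by one.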

What is the t-value for the comparison between the height to weight ratio scores?

Is there a significant difference between the scores before and after the diet/exercise
plan? If so, which is the greater height/weight ratio?



T-Test for Independent Samples

This is the parametric t-test for two independent samples - a between-subjects design
where, for example, subjects are randomly assigned to two separate test conditions
(e.g. drug and control) and the mean scores (e.g. reaction time) are compared to
determine if they are significantly different from each other.

In this case, you want to test whether there is a statistical difference in weight to
height ratios between the male and female subjects. The format for variables to be
used in the independent t-test is different from that used in the related. Instead of the
scores being placed in two separate columns (variables), all of the scores are placed in
a single column (variable). A second variable identifies for SPSS which of the two
groups each score belongs to. So, in this case, there is the variable HWRATIO2 as the
dependent variable and NSEX as the independent variable.

To run the analysis, go to ANALYZE, COMPARE MEANS and then
INDEPENDENT-SAMPLES T-TEST. As usual, the left column lists all the
variables in your data file. On the right, there are two boxes:

• The test variable(s) box is where you move the dependent variable(s) (e.g.,
HWRATIO2).
• The grouping variable box is where you move the variable that distinguishes
between the two independent groups (e.g. the variable NSEX).

First, select the dependent variable HWRATIO2 and move it over to the test
variable(s) section. Next, move NSEX over into the grouping variable section and
press the DEFINE GROUPS button. Values from the grouping variable must be
entered into the two boxes. In the case of the variable sex, where only two levels are
recorded, you would just enter "1" in the top box for male subjects, and "2" in the
lower one for female subjects. Hit the CONTINUE button, then hit the OK button.

[Note: There may be times where you have a larger range of values, such as five
different education levels, but only want to look at the difference between two of
them. You would enter the two values you wish to compare.]

OUTPUT

There are two sections:

• The first section of the output gives you a table called Group Statistics which
indicates the number of cases and the mean scores etc. for each condition.

• The second section provides a table called Independent Samples T-test and
starts with Levene's Test for Equality of Variances. If Levene's test is significant,
the variances of the two groups are unequal, and when you look at the results of
the t-test in the final table you use the line starting with Equal variances not assumed.
If it isn't significant, you look at the line starting with Equal variances assumed.
The final table gives you t-values, degrees of freedom and the two-tailed
significance levels.
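
For readers who want to see what Levene's test measures, the sketch below computes the Levene statistic for two groups in Python. It assumes the mean-centred form of the test (absolute deviations from each group's mean); the data are hypothetical and the significance level is not computed.

```python
from statistics import mean

def levene_w(group_a, group_b):
    """Levene's W statistic (mean-centred form) for two groups."""
    groups = [group_a, group_b]
    # Absolute deviation of every score from its own group mean
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = mean([v for zi in z for v in zi])
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical groups: the second is spread twice as widely as the first
print(round(levene_w([1, 2, 3, 4, 5], [1, 3, 5, 7, 9]), 3))
```

A larger W means the spread of scores differs more between the groups, which is what the significance level for Levene's test in the output is assessing.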

In this case, Levene's test is not significant (0.137), so we look at the Equal variances
assumed line. The t-test is not significant either (two-tailed significance of .478), so
we cannot reject the null hypothesis that males and females do not differ in their
height to weight ratios.
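
The two lines of the t-test table correspond to two ways of computing the statistic. The Python sketch below shows both under stated assumptions: the pooled-variance t-test for the Equal variances assumed line, and the Welch form with its adjusted degrees of freedom for the Equal variances not assumed line. The data are hypothetical and p-values are not computed.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(a, b, equal_variances=True):
    """t-value and degrees of freedom for two independent samples."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)
    diff = mean(a) - mean(b)
    if equal_variances:
        # Pooled variance: both groups are assumed to share one variance
        pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
        return diff / sqrt(pooled * (1 / na + 1 / nb)), na + nb - 2
    # Welch form: separate variances and adjusted degrees of freedom
    se_squared = va / na + vb / nb
    df = se_squared ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return diff / sqrt(se_squared), df

# Hypothetical scores for two groups
group_1 = [1, 2, 3, 4, 5]
group_2 = [2, 4, 6, 8, 10]
print(independent_t(group_1, group_2, equal_variances=True))
print(independent_t(group_1, group_2, equal_variances=False))
```

With equal group sizes the two t-values coincide and only the degrees of freedom differ; with unequal sizes and unequal variances both can change.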



Section II: Non-Parametric T-tests (Wilcoxon - related & Mann-Whitney - unrelated)


All of the tests today can be found under ANALYZE, NONPARAMETRIC TESTS


Mann-Whitney - Unrelated

This is the non-parametric counterpart of the t-test for two independent samples - a
between-subjects design. To run the analysis, choose: ANALYZE, NONPARAMETRIC
TESTS, and 2 INDEPENDENT SAMPLES.

As usual, the left column lists all the variables in your data file. On the right, there are
two boxes:

• the "test variable(s)" box is where you move the dependent variable(s)
• the "grouping variable" box is where you move the variable that distinguishes
between the two independent groups (e.g. the variable sex)

So, move HWRATIO2 into the test variable box, and move NSEX into the grouping
variable box. Now, click the Define Groups button. Values from the grouping variable
must be entered into the two boxes. In the case of the variable NSEX, you enter "1" in
the top box for male subjects, and "2" in the lower one for female subjects. Hit the
Continue button, then hit the OK button.

OUTPUT


SPSS divides the entire set of subjects into three groups:

• those with a score of 1 (male)
• those with a score of 2 (female)
• cases with missing data (which are excluded from the analysis)

The first section gives the mean ranks for the two conditions that are included, as well
as the sums of the ranks and the numbers of cases.

The second section gives the Z score and p-values for the test.
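
The quantities in the Mann-Whitney output can be reproduced with a short Python sketch. It ranks all scores together, then reports the mean rank and sum of ranks per group, the U statistic and a normal-approximation Z. This is an illustration with hypothetical data; the Z here has no correction for ties, so it may differ slightly from SPSS when scores are tied.

```python
from math import sqrt

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def mann_whitney(a, b):
    """Rank sums, mean ranks, U and an uncorrected Z for two groups."""
    ranks = average_ranks(list(a) + list(b))
    na, nb = len(a), len(b)
    sum_a, sum_b = sum(ranks[:na]), sum(ranks[na:])
    u_a = sum_a - na * (na + 1) / 2
    u = min(u_a, na * nb - u_a)
    z = (u - na * nb / 2) / sqrt(na * nb * (na + nb + 1) / 12)
    return {"mean_rank_a": sum_a / na, "mean_rank_b": sum_b / nb,
            "sum_a": sum_a, "sum_b": sum_b, "U": u, "Z": z}

# Hypothetical scores for two groups
print(mann_whitney([1, 2, 3], [4, 5, 6]))
```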

Is there a difference between males and females? How do the results from this week
compare to last week's?


Wilcoxon - Related

This is the non-parametric counterpart of the repeated measures t-test, for a
within-subjects design. As with the parametric equivalent, we'll be running a
comparison of height to weight ratios for the sample population before and after a
four-week exercise/diet program. To run the analysis, choose: ANALYZE,
NONPARAMETRIC TESTS, and 2 RELATED SAMPLES.


The dialogue box has the two-column format. The only difference is that you must
select pairs of variables and move them across. SPSS will analyse each pair to
determine if their mean ranks are significantly different statistically. For this analysis,
select the two variables HWRATIO and HWRATIO2, then click the Ok button.

OUTPUT


The output for this procedure is quite different from the parametric test. The first
section gives you information about how many rank scores for one condition are

less than (LT)
greater than (GT)
equal to (EQ)
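
The counts described above can be sketched in Python as follows. For each complete pair the difference is taken as second minus first (an assumption about the direction), zero differences are counted as equal and receive no rank, and the remaining absolute differences are ranked. The data are hypothetical and None stands in for a missing value.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def wilcoxon_summary(first, second):
    """Counts and rank sums of pairs where second is less than, greater than
    or equal to first."""
    diffs = [s - f for f, s in zip(first, second)
             if f is not None and s is not None]
    nonzero = [d for d in diffs if d != 0]  # equal pairs get no rank
    ranks = average_ranks([abs(d) for d in nonzero])
    less = [r for d, r in zip(nonzero, ranks) if d < 0]
    greater = [r for d, r in zip(nonzero, ranks) if d > 0]
    return {"n_less": len(less), "n_greater": len(greater),
            "n_equal": len(diffs) - len(nonzero),
            "rank_sum_less": sum(less), "rank_sum_greater": sum(greater)}

# Hypothetical paired scores; the last pair is incomplete and is dropped
first = [10, 12, 9, 11, None]
second = [8, 15, 9, 10, 7]
print(wilcoxon_summary(first, second))
```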
