Statistics for Economics,
Accounting and Business Studies
fourth edition
Michael Barrow
Additional student support at
www.pearsoned.co.uk/barrow
ISBN 0-273-68308-X
www.pearson-books.com
New to this edition:
• More worked examples and real life business applications show students how to use the various techniques
• Section exercises and end of chapter problems allow for practice and testing
• Chapters have been reorganised, making the order more logical and flexible
Features:
• Assumes no prior knowledge of statistics or advanced level mathematics
• Numerous real-life examples, problems and applications are included, some based on Excel
• Use of computing in statistics is explained and illustrated using industry-based software, databases, etc.
• Boxes highlight interesting issues, common mistakes and give advice on using computers in statistical analysis
• A website accompanies the book with resources for students and instructors
This fourth edition of Statistics for Economics, Accounting and Business Studies is written to provide a clear and
concise introduction to a range of statistical concepts and techniques. Throughout the text the author highlights
how and why these techniques can be used to solve real-life problems, ensuring that the material is relevant
to the experience of the student.
This is a core text for introductory courses in statistics
at undergraduate and MBA level. The book will be
particularly suitable for economics and accounting
students and will also appeal to those taking courses
in business studies.
Michael Barrow is Senior Lecturer in Economics at the
University of Sussex and has acted as a consultant for
major industrial, commercial and governmental bodies.
‘An excellent reference book for the undergraduate student; filled
with examples and applications – both practical (i.e. computer
based) and traditional (i.e. pen and paper problems); wide-ranging
and sensibly ordered. The book is clearly written, easy to follow…
yet not in the least patronising. This is a particular strength.’
Christopher Gerry, UCL
Front cover image:
© Getty Images
027368308X_COVER 3/23/07 12:29 PM Page 1
Visit the Statistics for Economics, Accounting and
Business Studies, fourth edition Companion Website at
www.pearsoned.co.uk/barrow to find valuable student
learning material including:
• Additional explanation of some topics with further references
• Excel worksheets and demonstrations
• Annotated links to other useful websites
We work with leading authors to develop the strongest
educational materials in Accounting, bringing cutting-edge
thinking and best learning practice to a global market.
Under a range of well-known imprints, including
Financial Times Prentice Hall, we craft high quality print
and electronic publications which help readers to
understand and apply their content, whether studying
or at work.
To find out more about the complete range of our
publishing, please visit us on the World Wide Web at:
www.pearsoned.co.uk
Fourth Edition
Statistics for
Economics, Accounting
and Business Studies
Michael Barrow
University of Sussex
Pearson Education Limited
Edinburgh Gate
Harlow
Essex CM20 2JE
England
and Associated Companies throughout the world
Visit us on the World Wide Web at:
www.pearsoned.co.uk
First published 1988
Fourth edition published 2006
© Pearson Education Limited 1988, 2006
The right of Michael Barrow to be identified as author of this work has been
asserted by him in accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored
in a retrieval system, or transmitted in any form or by any means, electronic,
mechanical, photocopying, recording or otherwise, without either the prior
written permission of the publisher or a licence permitting restricted copying
in the United Kingdom issued by the Copyright Licensing Agency Ltd,
90 Tottenham Court Road, London W1T 4LP.
All trademarks used herein are the property of their respective owners. The use
of any trademark in this text does not vest in the author or publisher any trademark
ownership rights in such trademarks, nor does the use of such trademarks imply any
affiliation with or endorsement of this book by such owners.
ISBN: 978-0-273-68308-7
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Cataloging-in-Publication Data
Barrow, Michael.
Statistics for economics, accounting, and business studies / Michael Barrow.— 4th ed.
p. cm.
Includes index.
ISBN-13: 978-0-273-68308-7
ISBN-10: 0-273-68308-X
1. Economics—Statistical methods. 2. Commercial statistics. I. Title.
HB137.B37 2006
519.5024'33—dc22
2005056640
109876543
10 09 08 07 06
Typeset in 9/12pt Stone Serif by 35
Printed and bound in Malaysia.
The publisher’s policy is to use paper manufactured from sustainable forests.
For Patricia, Caroline and Nicolas
Contents
Preface to the fourth edition xiii
Introduction 1
1 Descriptive statistics 7
Learning outcomes 8
Introduction 8
Summarising data using graphical techniques 10
Looking at cross-section data: wealth in the UK in 2001 15
Summarising data using numerical techniques 23
The box and whiskers diagram 41
Time-series data: investment expenditures 1970–2002 42
Graphing bivariate data: the scatter diagram 55
Data transformations 57
Guidance to the student: how to measure your progress 59
Summary 59
Key terms and concepts 60
Problems 60
Reference 66
Answers to exercises 66
Appendix 1A: Σ notation 70
Problems on Σ notation 71
Appendix 1B: E and V operators 72
Appendix 1C: Using logarithms 73
Problems on logarithms 74
2 Probability 75
Learning outcomes 75
Probability theory and statistical inference 76
The definition of probability 76
Probability theory: the building blocks 78
Bayes’ theorem 86
Decision analysis 88
Summary 92
Key terms and concepts 93
Problems 93
Answers to exercises 98
3 Probability distributions 101
Learning outcomes 101
Introduction 101
Random variables 102
The Binomial distribution 103
The Normal distribution 109
The sample mean as a Normally distributed variable 116
The relationship between the Binomial and Normal distributions 122
The Poisson distribution 123
Summary 126
Key terms and concepts 126
Problems 126
Answers to exercises 131
4 Estimation and confidence intervals 133
Learning outcomes 133
Introduction 134
Point and interval estimation 134
Rules and criteria for finding estimates 135
Estimation with large samples 138
Precisely what is a confidence interval? 141
Estimation with small samples: the t distribution 148
Summary 153
Key terms and concepts 154
Problems 154
Answers to exercises 156
Appendix: Derivations of sampling distributions 158
5 Hypothesis testing 159
Learning outcomes 159
Introduction 160
The concepts of hypothesis testing 160
The Prob-value approach 167
Significance, effect size and power 168
Further hypothesis tests 169
Hypothesis tests with small samples 174
Are the test procedures valid? 176
Hypothesis tests and confidence intervals 176
Independent and dependent samples 177
Discussion of hypothesis testing 180
Summary 182
Key terms and concepts 182
Problems 183
Reference 186
Answers to exercises 187
6 The χ² and F distributions 190
Learning outcomes 190
Introduction 190
The χ² distribution 191
The F distribution 205
Analysis of variance 207
Summary 214
Key terms and concepts 214
Problems 214
Answers to exercises 217
Appendix: Use of χ² and F distribution tables 219
7 Correlation and regression 220
Learning outcomes 220
Introduction 221
What determines the birth rate in developing countries? 221
Correlation 223
Regression analysis 232
Inference in the regression model 238
Summary 251
Key terms and concepts 252
Problems 252
References 255
Answers to exercises 255
8 Multiple regression 258
Learning outcomes 258
Introduction 259
Principles of multiple regression 260
What determines imports into the UK? 261
Finding the right model 278
Summary 285
Key terms and concepts 286
Problems 286
References 289
Answers to exercises 290
9 Data collection and sampling methods 295
Learning outcomes 295
Introduction 296
Using secondary data sources 296
Using electronic sources of data 298
Collecting primary data 300
The meaning of random sampling 301
Calculating the required sample size 309
Collecting the sample 310
Case study: the UK Expenditure and Food Survey 313
Summary 314
Key terms and concepts 315
Problems 315
References 316
10 Index numbers 317
Learning outcomes 318
Introduction 318
A simple index number 318
A price index with more than one commodity 320
Using expenditures as weights 327
Quantity and expenditure indices 329
The Retail Price Index 334
Inequality indices 339
The Lorenz curve 340
The Gini coefficient 342
Concentration ratios 346
Summary 348
Key terms and concepts 349
Problems 349
Reference 354
Answers to exercises 354
Appendix: Deriving the expenditure share form of
the Laspeyres price index 357
Important formulae used in this book 359
Appendix: Tables 364
Table A1 Random number table 364
Table A2 The standard Normal distribution 366
Table A3 Percentage points of the t distribution 367
Table A4 Critical values of the χ² distribution 368
Table A5(a) Critical values of the F distribution (upper 5% points) 370
Table A5(b) Critical values of the F distribution (upper 2.5% points) 372
Table A5(c) Critical values of the F distribution (upper 1% points) 374
Table A5(d) Critical values of the F distribution (upper 0.5% points) 376
Table A6 Critical values of Spearman’s rank correlation coefficient 378
Table A7 Critical values for the Durbin–Watson test at 5% significance level 379
Answers to problems 380
Index 394
Supporting resources
Visit www.pearsoned.co.uk/barrow to find valuable online resources
Companion Website for students
• Additional explanation of some topics with further references
• Excel worksheets and demonstrations
• Annotated links to other useful websites
For instructors
• Complete downloadable Instructor’s Manual, providing answers and commentary on exercises
• Downloadable PowerPoint slides
For more information please contact your local Pearson Education sales
representative or visit www.pearsoned.co.uk/barrow
Preface to the fourth edition
This text is aimed at students of economics and the closely related disciplines
of accountancy and business, and provides examples and problems relevant to
those subjects, using real data where possible. The book is at an elementary
level and requires no prior knowledge of statistics, nor advanced mathematics.
For those with a weak mathematical background and in need of some revision,
some recommended texts are given at the end of this preface.
This is not a cookbook of statistical recipes; it covers all the relevant concepts
so that an understanding of why a particular statistical test should be used
is gained. These concepts are introduced naturally in the course of the text as
they are required, rather than having sections to themselves. The book can
form the basis of a one- or two-term course, depending upon the intensity of
the teaching.
As well as explaining statistical concepts and methods, the different schools
of thought about statistical methodology are discussed, giving the reader some
insight into some of the debates that have taken place in the subject. The book
uses the methods of classical statistical analysis, for which some justification is
given in Chapter 5, as well as presenting criticisms which have been made of
these methods.
Changes in this edition

There have been some substantial changes to this edition in the light of my
own experience and comments from students and reviewers. There has been
some rearrangement of the chapters of the book, although the content remains
similar with a few changes to encourage better learning of the subject. The
main changes are:
• The old Chapters 2 (Index numbers) and 7 (Data collection and sampling
  methods) have been moved to the end of the book. This allows a continuous
  development from descriptive statistics, through probability concepts, to
  statistical inference in the first part of the book. This will suit many courses
  which concentrate on the use of statistics and which do not wish to focus on
  data collection. Index numbers and data collection now form the final two
  chapters which may be thought of as covering the collection and preparation
  of data.
• The previous edition’s final chapter on time-series methods (covering seasonal
  adjustment) has been dropped, but this chapter is available on the
  website for those who wish to make use of it. It was apparent that not many
  teachers used this chapter, so it has been dropped in order to keep the book
  relatively concise.
• In most chapters, exercises have been added within the chapter, at the end
  of each section, so that students can check that they have understood the
  material (answers are at the end of each chapter). The previous edition’s
  exercises (at the end of each chapter) are renamed ‘Problems’ and are mostly
  unchanged (with answers to odd-numbered problems at the end of the
  book). The new exercises are relatively straightforward and usually require
  the student to replicate the calculations in the text, but using different data.
  There is thus a distinction drawn between the exercises which check understanding
  and the problems which encourage deeper thinking and discussion.
• Some of the more challenging problems are indicated by highlighting the
  problem number in colour. This warns that the problem might require
  some additional insight or effort to solve, beyond what is learned from the
  text. This may be because a proof or demonstration is demanded, or that the
  problem is open-ended and requires interpretation.
• In a few places I have included some worked examples, but, in general, most
  of the book uses examples to explain the various techniques. The new exercises
  may be treated as worked examples if desired, as worked-out answers
  are given at the end of each chapter.
• Where appropriate, the examples used in the text have been updated using
  more recent data.
• There is a website (www.pearsoned.co.uk/barrow) accompanying the text.
  For this edition the website contains:
  – PowerPoint slides for lecturers to use (these contain most of the key
    tables, formulae and diagrams, but omit the text). Lecturers can adapt
    these for their own use.
  – An instructor’s manual giving hints and guidance on some of the teaching
    issues, including those that come up in response to some of the problems.
  – Answers to even-numbered problems (available to lecturers).
  – The chapter on seasonal adjustment of time-series data, mentioned above.
Mathematics requirements and texts

No more than elementary algebra is assumed in this text, any extensions being
covered as they are needed in the book. It is helpful if students are comfortable
manipulating equations so if some revision is required I recommend one of the
following books:
I. Jacques, Mathematics for Economics and Business, Prentice Hall, 2003.
E.T. Dowling, Mathematics for Economists, Schaum’s Outline Series in Economics, McGraw-Hill, 1986.
J. Black and J. Bradley, Essential Mathematics for Economists, 2nd edn, Wiley, 1980.

Author acknowledgements

I would like to thank the anonymous reviewers who made suggestions for this
new edition and the many colleagues and students who have passed on
comments or pointed out errors or omissions in previous editions. I would like
to thank all those at Pearson Education who have encouraged me, responded
to my various queries and reminded me of impending deadlines! Finally, I
would like to thank my family for giving me encouragement and the time to
complete this new edition.
Publisher’s acknowledgements
We are grateful to the following for permission to reproduce copyright material:
Crown copyright material is reproduced with the permission of the Controller
of HMSO and the Queen’s Printer for Scotland for the following under a click-use
licence: Figure I1; Figure I2; Table p3; Table 1.1; Figure 1.1; Figure 1.2;
Figure 1.3; Table 1.2; Figure 1.4; Figure 1.5; Figure 1.6; Table p15; Table 1.3;
Figure 1.7; Figure 1.8; Table 1.4; Figure 1.9; Table 1.12; Figure p23; Figure 1.3
p23; Table p60; Table p61b; Table p326: Table 10.22; Table 10.24; Table 7.1
Todaro’s data on birth rate, GNP, growth and inequality and table p45. Data to
analyse birthrate from Economic Development for a Developing World, 3rd ed,
Pearson Education (Todaro, M); ‘Cohabitation: not for long but here to stay’
from Journal of Royal Statistical Society, Series A, 163 (2), Blackwell Publishing
(Ermisch J and Francesconi M, 2000); Tab 10.26 from Real GDP per capita for
more than one hundred countries Economic Journal, Vol. 88 (350) p215–242
Blackwell Publishing (Kravis, Heston and Summers 1978); Table p197 ‘Road
accidents and darkness from some effects on accidents of changes in light conditions
at the beginning and end of British Summer Time’, Supplementary Report
587, Transport and Road Research Laboratory (Green H, 1980).
In some instances we have been unable to trace the owners of copyright material,
and we would appreciate any information that would enable us to do so.
Introduction
Statistics is a subject which can be (and is) applied to every aspect of our lives.
A glance at the annual Guide to Official Statistics published by the UK Office for
National Statistics, for example, gives some idea of the range of material available.
Under the letter ‘S’, for example, one finds entries for such disparate subjects
as salaries, schools, semolina(!), shipbuilding, short-time working, spoons, and
social surveys. It seems clear that whatever subject you wish to investigate, there
are data available to illuminate your study. However, it is a sad fact that many
people do not understand the use of statistics, do not know how to draw
proper inferences (conclusions) from them, or misrepresent them. Even (especially?)
politicians are not immune from this – for example, it sometimes
appears they will not be happy until all school pupils and students are above
average in ability and achievement.
Two types of statistics

The subject of statistics can usefully be divided into two parts, descriptive
statistics (covered in Chapters 1 and 10 of this book) and inferential statistics
(Chapters 4–8), which are based upon the theory of probability (Chapters 2
and 3). Descriptive statistics are used to summarise information which would
otherwise be too complex to take in, by means of techniques such as averages
and graphs. The graph shown in Figure I.1 is an example, summarising drinking
habits in the UK.

Figure I.1 Alcohol consumption in the UK

The graph reveals, for instance, that about 43% of men and 57% of women
drink between 1 and 10 units of alcohol per week (a unit is roughly equivalent
to one glass of wine or half a pint of beer). The graph also shows that men tend
to drink more than women (this is probably no surprise to you), with higher
proportions drinking 11–20 units and over 21 units per week. This simple
graph has summarised a vast amount of information, the consumption levels
of about 45 million adults.
Even so, it is not perfect and much information is hidden. It is not obvious
from the graph that the average consumption of men is 16 units per week, of
women only 6 units. From the graph, you would probably have expected the
averages to be closer together. This shows that graphical and numerical summary
measures can complement each other. Graphs can give a very useful
visual summary of the information but are not very precise. For example, it is
difficult to convey in words the content of a graph; you have to see it.
Numerical measures such as the average are more precise and are easier to convey
to others. Imagine you had data for student alcohol consumption; how do
you think this would compare to the graph? It would be easy to tell someone
whether the average is higher or lower, but comparing the graphs is difficult
without actually viewing them.
Statistical inference, the second type of statistics covered, concerns the relationship
between a sample of data and the population (in the statistical sense,
not necessarily human) from which it is drawn. In particular, it asks what inferences
can be validly drawn about the population from the sample. Sometimes
the sample is not representative of the population (either due to bad sampling
procedures or simply due to bad luck) and does not give us a true picture of
reality.
The graph was presented as fact but it is actually based on a sample of individuals,
since it would obviously be impossible to ask everyone about their
drinking habits. Does it therefore provide a true picture of drinking habits? We
can be reasonably confident that it does, for two reasons. First, the government
statisticians who collected the data designed the survey carefully, ensuring that
all age groups are fairly represented, and did not conduct all the interviews in
pubs, for example. Second, the sample is a large one (about 10 000 households)
so there is little possibility of getting an unrepresentative sample. It would be
very unlucky if the sample consisted entirely of teetotallers, for example. We
can be reasonably sure, therefore, that the graph is a fair reflection of reality
and that the average woman drinks around 6 units of alcohol per week.
However, we must remember that there is some uncertainty about this estimate.
Statistical inference provides the tools to measure that uncertainty.
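The link between sample size and uncertainty can be illustrated with a short Python sketch. It uses the standard error of a sample mean (the standard deviation divided by the square root of the sample size), which assumes independent random sampling. The standard deviation of 8 units is a made-up figure for illustration, not a value from the survey, and `standard_error` is a hypothetical helper name.

```python
import math

# Hypothetical standard deviation of weekly alcohol units (NOT from the survey).
assumed_sd = 8.0


def standard_error(sd, n):
    """Standard error of a sample mean based on n independent observations."""
    return sd / math.sqrt(n)


# Compare a small sample with one of about the size mentioned in the text.
for n in (100, 10_000):
    print(n, standard_error(assumed_sd, n))
```

With the assumed figure, increasing the sample from 100 to 10 000 observations cuts the standard error from 0.8 to 0.08 units, a tenfold gain in precision.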
The scatter diagram in Figure I.2 (considered in more detail in Chapter 7)
shows the relationship between economic growth and the birth rate in 12
developing countries. It illustrates a negative relationship – higher economic
growth appears to be associated with lower birth rates.
Figure I.2 Birthrate vs growth rate

Once again we actually have a sample of data, drawn from the population
of all countries. What can we infer from the sample? Is it likely that the ‘true’
relationship (what we would observe if we had all the data) is similar, or do
we have an unrepresentative sample? In this case the sample size is quite small
and the sampling method is not known, so we might be cautious in our
conclusions.
Statistics and you

By the time you have finished this book you will have encountered and, I
hope, mastered a range of statistical techniques. However, becoming a competent
statistician is about more than learning the techniques, and comes with
time and practice. You could go on to learn about the subject at a deeper level
and learn some of the many other techniques that are available. However, I
believe you can go a long way with the simple methods you learn here, and
gain insight into a wide range of problems. A nice example of this is contained
in the article ‘Error Correction Models: Specification, Interpretation, Estimation’,
by G. Alogoskoufis and R. Smith in the Journal of Economic Surveys, 1991
(vol. 5, pp. 27–128), examining the relationship between wages, prices and
other variables. After 19 pages analysing the data using techniques far more
advanced than those presented in this book, they state ‘the range of statistical
techniques utilised have not provided us with anything more than we would
have got by taking the [ ] variables and looking at their graphs’. Sometimes
advanced techniques are needed, but never underestimate the power of the
humble graph.
Beyond a technical mastery of the material, being a statistician encompasses
a range of more informal skills which you should endeavour to acquire. I hope
that you will learn some of these from reading this book. For example, you
should be able to spot errors in analyses presented to you, because your statistical
‘intuition’ rings a warning bell telling you something is wrong. For example,
the Guardian newspaper, on its front page, once provided a list of the ‘best’ schools
in England, based on the fact that in each school, every one of its pupils passed
a national exam – a 100% success rate. Curiously, all of the schools were relatively
small, so perhaps this implies that small schools get better results than
large ones. Once you can think statistically you can spot the fallacy in this
argument. Try it. The answer is at the end of this introduction.
Here is another example. The UK Department of Health released the following
figures about health spending, showing how planned expenditure (in £m)
was to increase.

                   1998–99   1999–00   2000–01   2001–02   Total increase over 3-year period
Health spending     37 169    40 228    43 129    45 985    17 835

The total increase in the final column seems implausibly large, especially
when compared to the level of spending. The increase is about 45% of the
level. This should set off the warning bell, once you have a ‘feel’ for statistics
(and, perhaps, a certain degree of cynicism about politics!). The ‘total increase’
is the result of counting the increase from 98–99 to 99–00 three times, the
increase from 99–00 to 00–01 twice, plus the increase from 00–01 to 01–02. It
therefore measures the cumulative extra resources to health care over the whole
period, but not the year-on-year increase, which is what many people would
interpret it to be.
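The arithmetic behind the health spending example can be checked directly. The following Python sketch uses only the four spending figures from the table; the variable names are illustrative.

```python
# Planned health spending in £m, as given in the table.
spending = {"1998-99": 37169, "1999-00": 40228, "2000-01": 43129, "2001-02": 45985}

levels = list(spending.values())

# Year-on-year increases between consecutive years.
increases = [later - earlier for earlier, later in zip(levels, levels[1:])]

# The published "total increase": each later year's spending above the
# 1998-99 baseline, summed over the three later years.
cumulative_extra = sum(level - levels[0] for level in levels[1:])

# What many readers would expect instead: the change in the annual level.
change_in_level = levels[-1] - levels[0]

print(increases)         # [3059, 2901, 2856]
print(cumulative_extra)  # 17835
print(change_in_level)   # 8816
```

The cumulative figure equals 3 × 3059 + 2 × 2901 + 2856, which is exactly the triple and double counting described in the text, whereas annual spending itself rises by only 8816 over the period.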
You will also become aware that data cannot be examined without their
context. The context might determine the methods you use to analyse
the data, or influence the manner in which the data are collected. For example,
the exchange rate and the unemployment rate are two economic variables
which behave very differently. The former can change substantially, even on a
daily basis, and its movements tend to be unpredictable. Unemployment
changes only slowly and if the level is high this month it is likely to be high
again next month. There would be little point in calculating the unemployment
rate on a daily basis, yet this makes some sense for the exchange rate.
Economic theory tells us quite a lot about these variables even before we begin
to look at the data. We should therefore learn to be guided by an appropriate
theory when looking at the data – it will usually be a much more effective way
to proceed.
Another useful skill is the ability to present and explain statistical concepts
and results to others. If you really understand something you should be able to
explain it to someone else – this is often a good test of your own knowledge.
Below are two examples of a verbal explanation of the variance (covered in
Chapter 1) to illustrate.
Good explanation
The variance of a set of observations
expresses how spread out are the numbers.
A low value of the variance indicates that
the observations are of similar size, a
high value indicates that they are widely
spread around the average.
Bad explanation
The variance is a formula for the deviations,
which are squared and added up.
The differences are from the mean, and
divided by n or sometimes by n − 1.
The bad explanation is a failed attempt to explain the formula for the variance
and gives no insight into what it really is. The good explanation tries to convey
the meaning of the variance without worrying about the formula (which is best
written down). For a (statistically) unsophisticated audience the explanation is
quite useful and might then be supplemented by a few examples.
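To see the good explanation in action, the sketch below compares two small, made-up sets of observations with the same average, using Python's standard `statistics` module. `pvariance` divides by n and `variance` divides by n − 1, the two divisors mentioned in the bad explanation.

```python
import statistics

# Hypothetical data for illustration only: both sets have a mean of 50.
tight = [48, 49, 50, 51, 52]    # observations of similar size
spread = [10, 30, 50, 70, 90]   # observations widely spread around the average

for name, data in [("tight", tight), ("spread", spread)]:
    mean = statistics.mean(data)
    var_n = statistics.pvariance(data)        # sum of squared deviations / n
    var_n_minus_1 = statistics.variance(data)  # sum of squared deviations / (n - 1)
    print(name, mean, var_n, var_n_minus_1)
```

The tightly clustered set has a variance of 2 (dividing by n) and the widely spread set a variance of 800, although both have a mean of 50.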
Statistics can also be written well or badly. Two examples follow, concerning
a confidence interval, which is explained in Chapter 4. Do not worry if you do
not understand the statistics now.
Good explanation
The 95% confidence interval is given by
$\bar{X} \pm 1.96 \times \sqrt{s^2/n}$
Inserting the sample values $\bar{X} = 400$, $s^2 = 1600$ and $n = 30$ into the formula we obtain
$400 \pm 1.96 \times \sqrt{1600/30}$
yielding the interval
[385.7, 414.3]
Bad explanation
95% interval $= \bar{X} - 1.96\sqrt{s^2/n} = \bar{X} + 1.96\sqrt{s^2/n} = 0.95$
$= 400 - 1.96\sqrt{1600/30}$ and $= 400 + 1.96\sqrt{1600/30}$
so we have [385.7, 414.3]
In good statistical writing there is a logical flow to the argument, like a
written sentence. It is also concise and precise, without too much extraneous
material. The good explanation exhibits these characteristics whereas the bad
explanation is simply wrong and incomprehensible, even though the final
answer is correct. You should therefore try to note the way the statistical arguments
are laid out in this book, as well as take in their content.
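The calculation in the good explanation can be reproduced in a few lines of Python. This is only a sketch of the arithmetic using the sample values quoted in the example; the variable names are illustrative.

```python
import math

x_bar = 400       # sample mean
s_squared = 1600  # sample variance
n = 30            # sample size
z = 1.96          # standard Normal value used for a 95% interval

# Half-width of the interval: 1.96 times the square root of s^2 / n.
half_width = z * math.sqrt(s_squared / n)
lower, upper = x_bar - half_width, x_bar + half_width

print(f"[{lower:.1f}, {upper:.1f}]")  # [385.7, 414.3]
```

Each line of code corresponds to one step of the written argument, which is the same logical flow that the good explanation follows.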
When you do the exercises at the end of each chapter, try to get another
student to read your work through. If they cannot understand the flow or logic
of your work then you have not succeeded in presenting your work sufficiently
accurately.
Answer to the ‘best’ schools problem

A high proportion of small schools appear in the list simply because they are
lucky. Consider one school of 20 pupils, another with 1000, where the average
ability is similar in both. The large school is highly unlikely to obtain a 100%
pass rate, simply because there are so many pupils and (at least) one of them
will probably perform badly. With 20 pupils, you have a much better chance of
getting them all through. This is just a reflection of the fact that there tends to
be greater variability in smaller samples. The schools themselves, and the
pupils, are of similar quality.
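The effect of school size can be illustrated numerically. The sketch below assumes, purely for illustration, that every pupil has the same 95% chance of passing and that pupils pass or fail independently of one another. Neither the figure nor the independence assumption comes from the newspaper's data, and `prob_all_pass` is a hypothetical helper.

```python
def prob_all_pass(n_pupils, p_pass):
    """Chance that all n_pupils pass, if each passes independently with probability p_pass."""
    return p_pass ** n_pupils


p_pass = 0.95  # assumed pass probability per pupil (illustrative only)

# The two school sizes used in the text.
for n_pupils in (20, 1000):
    print(n_pupils, prob_all_pass(n_pupils, p_pass))
```

Under these assumptions a school of 20 pupils has roughly a 36% chance of a 100% pass rate, while for a school of 1000 pupils the chance is effectively zero, even though the pupils are equally able in both.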
1 Descriptive statistics

Contents
Learning outcomes 8
Introduction 8
Summarising data using graphical techniques 10
Education and employment, or, after all this, will you get a job? 10
The bar chart 10
The pie chart 13
Looking at cross-section data: wealth in the UK in 2001 15
Frequency tables and histograms 15
The histogram 17
Relative frequency and cumulative frequency distributions 19
Summarising data using numerical techniques 23
Measures of location: the mean 24
The mean as the expected value 26
The sample mean and the population mean 26
The weighted average 27
The median 28
The mode 29
Measures of dispersion 31
The variance 33
The standard deviation 34
The variance and standard deviation of a sample 35
Alternative formulae for calculating the variance and standard deviation 36
The coefficient of variation 37
The standard deviation of the logarithm 37
Measuring deviations from the mean: z scores 38
Chebyshev’s inequality 39
Measuring skewness 40
Comparison of the 2001 and 1979 distributions of wealth 40
The box and whiskers diagram 41
Time-series data: investment expenditures 1970–2002 42
Graphing multiple series 47
Numerical summary statistics 50
The mean of a time series 50
The geometric mean 51
An approximate way of obtaining the average growth rate 52
The variance of a time series 52
Graphing bivariate data: the scatter diagram 55
Data transformations 57
Rounding 57
Grouping 58
Dividing/multiplying by a constant 58
Differencing 58
Taking logarithms 58
Taking the reciprocal 58
Deflating 59
Guidance to the student: how to measure your progress 59
Summary 59
Key terms and concepts 60
Problems 60
Reference 66
Answers to exercises 66
Appendix 1A: Σ notation 70
Problems on Σ notation 71
Appendix 1B: E and V operators 72
Appendix 1C: Using logarithms 73
Problems on logarithms 74

Learning outcomes

By the end of this chapter you should be able to:
• recognise different types of data and use appropriate methods to summarise and analyse them
• use graphical techniques to provide a visual summary of one or more data series
• use numerical techniques (such as an average) to summarise data series
• recognise the strengths and limitations of such methods
• recognise the usefulness of data transformations to gain additional insight into a set of data
Introduction
The aim of descriptive statistical methods is simple: to present information in a
clear, concise and accurate manner. The difficulty in analysing many phenomena,
be they economic, social or otherwise, is that there is simply too much
information for the mind to assimilate. The task of descriptive methods is
therefore to summarise all this information and draw out the main features,
without distorting the picture.
Consider, for example, the problem of presenting information about the
wealth of British citizens (which follows later in this chapter). There are about
17 million households on which data are available and to present the data in
raw form (i.e. the wealth holdings of each and every family) would be neither
useful nor informative (it would take about 30 000 pages of a book, for example).
It would be more useful to have much less information, but information which