Statistical Analysis with Excel®
3RD EDITION
by Joseph Schmuller, PhD
Statistical Analysis with Excel® For Dummies®, 3rd Edition
Published by
John Wiley & Sons, Inc.
111 River Street
Hoboken, NJ 07030-5774
www.wiley.com
Copyright © 2013 by John Wiley & Sons, Inc., Hoboken, New Jersey
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or
by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted
under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written
permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the
Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600.
Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley
& Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at
http://www.wiley.com/go/permissions.
Trademarks: Wiley, the Wiley logo, For Dummies, the Dummies Man logo, A Reference for the Rest of Us!,
The Dummies Way, Dummies Daily, The Fun and Easy Way, Dummies.com, Making Everything Easier, and
related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates
in the United States and other countries, and may not be used without written permission. Microsoft is a
registered trademark of Microsoft Corporation. All other trademarks are the property of their respective
owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND THE AUTHOR MAKE NO
REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF
THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT
LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE
CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES
CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE
UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR
OTHER PROFESSIONAL SERVICES. IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF
A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT. NEITHER THE PUBLISHER NOR THE
AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT THAT AN ORGANIZATION
OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A POTENTIAL SOURCE OF
FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER ENDORSES THE
INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY
MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK
MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT
IS READ. FULFILLMENT OF EACH COUPON OFFER IS THE SOLE RESPONSIBILITY OF THE OFFEROR.
For general information on our other products and services, please contact our Customer Care
Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993, or fax 317-572-4002.
For technical support, please visit www.wiley.com/techsupport.
Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material
included with standard print versions of this book may not be included in e-books or in print-on-demand.
If this book refers to media such as a CD or DVD that is not included in the version you purchased, you
may download this material at . For more information about Wiley
products, visit www.wiley.com.
Library of Congress Control Number: 2013932117
ISBN 978-1-118-46431-1 (pbk); ISBN 978-1-118-46432-8 (ebk); ISBN 978-1-118-46433-5 (ebk);
ISBN 978-1-118-46434-2 (ebk)
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1
About the Author
Joseph Schmuller, PhD is a veteran of over 25 years in Information
Technology. He is the author of several books on computing, including the
three editions of Teach Yourself UML in 24 Hours (SAMS), and the two editions
of Statistical Analysis with Excel For Dummies. He has written numerous
articles on advanced technology. From 1991 through 1997, he was
Editor-in-Chief of PC AI magazine.
He is a former member of the American Statistical Association, and he has
taught statistics at the undergraduate and graduate levels. He holds a B.S.
from Brooklyn College, an M.A. from the University of Missouri-Kansas City,
and a Ph.D. from the University of Wisconsin, all in psychology. He and his
family live in Jacksonville, Florida, where he is on the faculty at the University
of North Florida.
Dedication
In loving memory of my wonderful mother, Sara Riba Schmuller, who first
showed me how to work with numbers, and taught me the skills to write
about them.
Author’s Acknowledgments
As I said in the first two editions, writing a For Dummies book is an incredible
amount of fun. You get to air out your ideas in a friendly, conversational way,
and you get a chance to throw in some humor, too. To write one more edition
is a wonderful trifecta. I worked again with a terrific team. Acquisitions
Editor Stephanie McComb and Project Editor Beth Taylor of Wiley have been
encouraging, cooperative, and above all, patient. Dennis Short is unsurpassed
as a Technical Editor. His students at Purdue are lucky to have him. Any
errors that remain are under the sole proprietorship of the author. My
deepest thanks to Stephanie and Beth. My thanks to Waterside Productions
for representing me in this effort.
Again I thank mentors in college and graduate school who helped shape my
statistical knowledge: Mitch Grossberg (Brooklyn College); Mort Goldman, Al
Hillix, Larry Simkins, and Jerry Sheridan (University of Missouri-Kansas City);
and Cliff Gillman and John Theios (University of Wisconsin-Madison). A long
time ago at the University of Missouri-Kansas City, Mort Goldman exempted
me from a graduate statistics final on one condition — that I learn the last
course topic, Analysis of Covariance, on my own. I hope he’s happy with
Appendix B.
I thank Kathryn as always for so much more than I can say. Finally, again a
special note of thanks to my friend Brad, who suggested this whole thing in
the rst place!
Publisher’s Acknowledgments
We’re proud of this book; please send us your comments at .
For other comments, please contact our Customer Care Department within the U.S. at 877-762-2974,
outside the U.S. at 317-572-3993, or fax 317-572-4002.
Some of the people who helped bring this book to market include the following:
Acquisitions, Editorial, and
Vertical Websites
Project Editor: Beth Taylor
Acquisitions Editor: Stephanie McComb
Copy Editor: Beth Taylor
Technical Editor: Dennis Short
Editorial Director: Robyn Siesky
Vertical Websites: Rich Graves
Editorial Assistant: Kathleen Jeffers
Cover Photo: © NAN104 / iStockphoto
Composition Services
Project Coordinator: Sheree Montgomery
Layout and Graphics: Joyce Haughey,
Christin Swinford
Proofreaders: Debbye Butler, Melissa Cossell
Indexer: Potomac Indexing, LLC
Publishing and Editorial for Technology Dummies
Richard Swadley, Vice President and Executive Group Publisher
Andy Cummings, Vice President and Publisher
Mary Bednarek, Executive Acquisitions Director
Mary C. Corder, Editorial Director
Publishing for Consumer Dummies
Kathleen Nebenhaus, Vice President and Executive Publisher
Composition Services
Debbie Stailey, Director of Composition Services
Contents at a Glance
Introduction 1
Part I: Getting Started with Statistical
Analysis with Excel 7
Chapter 1: Evaluating Data in the Real World 9
Chapter 2: Understanding Excel’s Statistical Capabilities 31
Part II: Describing Data 63
Chapter 3: Show and Tell: Graphing Data 65
Chapter 4: Finding Your Center 97
Chapter 5: Deviating from the Average 113
Chapter 6: Meeting Standards and Standings 131
Chapter 7: Summarizing It All 147
Chapter 8: What’s Normal? 173
Part III: Drawing Conclusions from Data 185
Chapter 9: The Confidence Game: Estimation 187
Chapter 10: One-Sample Hypothesis Testing 203
Chapter 11: Two-Sample Hypothesis Testing 219
Chapter 12: Testing More Than Two Samples 251
Chapter 13: Slightly More Complicated Testing 279
Chapter 14: Regression: Linear and Multiple 293
Chapter 15: Correlation: The Rise and Fall of Relationships 331
Part IV: Probability 353
Chapter 16: Introducing Probability 355
Chapter 17: More on Probability 379
Chapter 18: A Career in Modeling 393
Part V: The Part of Tens 413
Chapter 19: Ten Statistical and Graphical Tips and Traps 415
Chapter 20: Ten Things (Thirteen, Actually)
That Just Didn’t Fit in Any Other Chapter 421
Appendix A: When Your Worksheet Is a Database 451
Appendix B: The Analysis of Covariance 467
Index 481
Bonus Appendix 1: When Your Data Live Elsewhere
Bonus Appendix 2: Tips for Teachers (And Learners)
Table of Contents
Introduction 1
About This Book 2
What You Can Safely Skip 2
Foolish Assumptions 2
How This Book Is Organized 3
Part I: Getting Started with Statistical Analysis with Excel 3
Part II: Describing Data 3
Part III: Drawing Conclusions from Data 3
Part IV: Probability 4
Part V: The Part of Tens 4
Appendix A: When Your Worksheet is a Database 4
Appendix B: The Analysis of Covariance 4
Bonus Appendix 1: When Your Data Live Elsewhere 5
Bonus Appendix 2: Tips for Teachers (And Learners) 5
Icons Used in This Book 5
Where to Go from Here 6
Part I: Getting Started with Statistical
Analysis with Excel 7
Chapter 1: Evaluating Data in the Real World 9
The Statistical (And Related) Notions You Just Have to Know 9
Samples and populations 10
Variables: Dependent and independent 11
Types of data 12
A little probability 13
Inferential Statistics: Testing Hypotheses 14
Null and alternative hypotheses 15
Two types of error 16
What’s New in Excel 2013? 18
What’s Old in Excel 2013? 22
Knowing the Fundamentals 24
Autolling cells 24
Referencing cells 26
What’s New in This Edition? 28
Chapter 2: Understanding Excel’s Statistical Capabilities 31
Getting Started 31
Setting Up for Statistics 34
Worksheet functions in Excel 2013 34
Quickly accessing statistical functions 37
Array functions 38
What’s in a name? An array of possibilities 42
Creating your own array formulas 50
Using data analysis tools 51
Accessing Commonly Used Functions 55
For Mac Users 56
The Ribbon 57
Data analysis tools 58
Part II: Describing Data 63
Chapter 3: Show and Tell: Graphing Data 65
Why Use Graphs? 65
Some Fundamentals 67
Excel’s Graphics (Chartics?) Capabilities 67
Inserting a chart 68
Becoming a Columnist 69
Stacking the columns 73
One more thing 74
Slicing the Pie 75
A word from the wise 77
Drawing the Line 77
Adding a Spark 81
Passing the Bar 83
The Plot Thickens 85
Finding Another Use for the Scatter Chart 89
Power View! 90
For Mac Users 93
Chapter 4: Finding Your Center 97
Means: The Lore of Averages 97
Calculating the mean 98
AVERAGE and AVERAGEA 99
AVERAGEIF and AVERAGEIFS 101
TRIMMEAN 104
Other means to an end 106
Medians: Caught in the Middle 108
Finding the median 108
MEDIAN 109
Statistics À La Mode 110
Finding the mode 110
MODE.SNGL and MODE.MULT 110
Chapter 5: Deviating from the Average 113
Measuring Variation 114
Averaging squared deviations: Variance and how to calculate it 114
VAR.P and VARPA 117
Sample variance 119
VAR.S and VARA 119
Back to the Roots: Standard Deviation 120
Population standard deviation 121
STDEV.P and STDEVPA 121
Sample standard deviation 122
STDEV.S and STDEVA 122
The missing functions: STDEVIF and STDEVIFS 123
Related Functions 127
DEVSQ 127
Average deviation 128
AVEDEV 129
Chapter 6: Meeting Standards and Standings 131
Catching Some Zs 131
Characteristics of z-scores 132
Bonds versus the Bambino 132
Exam scores 133
STANDARDIZE 134
Where Do You Stand? 136
RANK.EQ and RANK.AVG 136
LARGE and SMALL 138
PERCENTILE.INC and PERCENTILE.EXC 139
PERCENTRANK.INC and PERCENTRANK.EXC 141
Data analysis tool: Rank and Percentile 143
For Mac Users 145
Chapter 7: Summarizing It All 147
Counting Out 147
COUNT, COUNTA, COUNTBLANK, COUNTIF, COUNTIFS 147
The Long and Short of It 150
MAX, MAXA, MIN, and MINA 150
Getting Esoteric 152
SKEW and SKEW.P 152
KURT 154
Tuning In the Frequency 156
FREQUENCY 156
Data analysis tool: Histogram 158
Can You Give Me a Description? 160
Data analysis tool: Descriptive Statistics 160
Be Quick About It! 162
Instant Statistics 165
For Mac Users 167
Descriptive statistics 167
Histogram 169
Instant statistics 170
Chapter 8: What’s Normal? 173
Hitting the Curve 173
Digging deeper 174
Parameters of a normal distribution 175
NORM.DIST 177
NORM.INV 178
A Distinguished Member of the Family 179
NORM.S.DIST 181
NORM.S.INV 181
PHI and GAUSS 182
Part III: Drawing Conclusions from Data 185
Chapter 9: The Condence Game: Estimation 187
Understanding Sampling Distribution 187
An EXTREMELY Important Idea: The Central Limit Theorem 189
Simulating the Central Limit Theorem 190
The Limits of Confidence 195
Finding confidence limits for a mean 195
CONFIDENCE.NORM 198
Fit to a t 199
CONFIDENCE.T 201
Chapter 10: One-Sample Hypothesis Testing 203
Hypotheses, Tests, and Errors 203
Hypothesis tests and sampling distributions 204
Catching Some Z’s Again 207
ZTEST 209
t for One 211
T.DIST, T.DIST.RT, and T.DIST.2T 212
T.INV and T.INV.2T 213
Testing a Variance 214
CHISQ.DIST and CHISQ.DIST.RT 216
CHISQ.INV and CHISQ.INV.RT 217
Chapter 11: Two-Sample Hypothesis Testing 219
Hypotheses Built for Two 219
Sampling Distributions Revisited 220
Applying the Central Limit Theorem 221
Z’s once more 223
Data analysis tool: z-Test: Two Sample for Means 224
t for Two 227
Like peas in a pod: Equal variances 227
Like p’s and q’s: Unequal variances 229
T.TEST 229
Data Analysis Tool: t-Test: Two Sample 230
A Matched Set: Hypothesis Testing for Paired Samples 234
T.TEST for matched samples 235
Data analysis tool: t-test: Paired Two Sample for Means 237
Testing Two Variances 239
Using F in conjunction with t 241
F.TEST 242
F.DIST and F.DIST.RT 244
F.INV and F.INV.RT 245
Data Analysis Tool: F-test Two Sample for Variances 246
For Mac Users 248
Chapter 12: Testing More Than Two Samples 251
Testing More Than Two 251
A thorny problem 252
A solution 253
Meaningful relationships 257
After the F-test 258
Data analysis tool: Anova: Single Factor 262
Comparing the means 263
Another Kind of Hypothesis, Another Kind of Test 265
Working with repeated measures ANOVA 266
Getting trendy 268
Data analysis tool: Anova: Two Factor Without Replication 271
Analyzing trend 273
For Mac Users 275
Single Factor Analysis of Variance 275
Repeated Measures 276
Chapter 13: Slightly More Complicated Testing 279
Cracking the Combinations 279
Breaking down the variances 280
Data analysis tool: Anova: Two-Factor Without Replication 281
Cracking the Combinations Again 284
Rows and columns 284
Interactions 285
The analysis 285
Data analysis tool: Anova: Two-Factor With Replication 287
For Mac Users 290
Chapter 14: Regression: Linear and Multiple 293
The Plot of Scatter 293
Graphing Lines 295
Regression: What a Line! 297
Using regression for forecasting 299
Variation around the regression line 299
Testing hypotheses about regression 301
Worksheet Functions for Regression 307
SLOPE, INTERCEPT, STEYX 307
FORECAST 309
Array function: TREND 309
Array function: LINEST 313
Data Analysis Tool: Regression 315
Tabled output 317
Graphic output 319
Juggling Many Relationships at Once: Multiple Regression 320
Excel Tools for Multiple Regression 321
TREND revisited 321
LINEST revisited 322
Regression data analysis tool revisited 325
For Mac Users 327
Chapter 15: Correlation: The Rise and Fall of Relationships 331
Scatterplots Again 331
Understanding Correlation 332
Correlation and Regression 334
Testing Hypotheses About Correlation 338
Is a correlation coefficient greater than zero? 338
Do two correlation coefficients differ? 339
Worksheet Functions for Correlation 340
CORREL and PEARSON 341
RSQ 342
COVARIANCE.P and COVARIANCE.S 343
Data Analysis Tool: Correlation 343
Tabled output 345
Data Analysis Tool: Covariance 348
Testing Hypotheses About Correlation 349
Worksheet Functions: FISHER, FISHERINV 349
For Mac Users 350
Part IV: Probability 353
Chapter 16: Introducing Probability 355
What Is Probability? 355
Experiments, trials, events, and sample spaces 356
Sample spaces and probability 356
Compound Events 357
Union and intersection 357
Intersection again 358
Conditional Probability 359
Working with the probabilities 360
The foundation of hypothesis testing 360
Large Sample Spaces 361
Permutations 362
Combinations 362
Worksheet Functions 363
FACT 363
PERMUT and PERMUTATIONA 364
COMBIN and COMBINA 365
Random Variables: Discrete and Continuous 365
Probability Distributions and Density Functions 366
The Binomial Distribution 368
Worksheet Functions 369
BINOM.DIST and BINOM.DIST.RANGE 370
NEGBINOM.DIST 372
Hypothesis Testing with the Binomial Distribution 373
BINOM.INV 374
More on hypothesis testing 375
The Hypergeometric Distribution 376
HYPGEOM.DIST 377
Chapter 17: More on Probability 379
Discovering Beta 379
BETA.DIST 381
BETA.INV 383
Poisson 384
POISSON.DIST 385
Working with Gamma 387
The Gamma function and GAMMA 387
The Gamma Distribution and GAMMA.DIST 388
GAMMA.INV 390
Exponential 391
EXPON.DIST 391
Chapter 18: A Career in Modeling 393
Modeling a Distribution 393
Plunging into the Poisson distribution 394
Using POISSON.DIST 396
Testing the model’s fit 396
A word about CHISQ.TEST 399
Playing ball with a model 400
A Simulating Discussion 402
Taking a chance: The Monte Carlo method 403
Loading the dice 403
Simulating the Central Limit Theorem 407
For Mac Users 410
Part V: The Part of Tens 413
Chapter 19: Ten Statistical and Graphical Tips and Traps 415
Signicant Doesn’t Always Mean Important 415
Trying to Not Reject a Null Hypothesis
Has a Number of Implications 416
Regression Isn’t Always Linear 416
Extrapolating Beyond a Sample Scatterplot Is a Bad Idea 417
Examine the Variability Around a Regression Line 417
A Sample Can Be Too Large 417
Consumers: Know Your Axes 418
Graphing a Categorical Variable as Though It’s a
Quantitative Variable Is Just Wrong 418
Whenever Appropriate, Include Variability in Your Graph 419
Be Careful When Relating Statistics Textbook Concepts to Excel 420
Chapter 20: Ten Things (Thirteen, Actually)
That Just Didn’t Fit in Any Other Chapter 421
Forecasting Techniques 421
A moving experience 422
How to be a smoothie, exponentially 424
Graphing the Standard Error of the Mean 425
Probabilities and Distributions 429
PROB 429
WEIBULL.DIST 429
Drawing Samples 430
Testing Independence: The True Use of CHISQ.TEST 431
Logarithmica Esoterica 434
What is a logarithm? 434
What is e? 436
LOGNORM.DIST 439
LOGNORM.INV 440
Array Function: LOGEST 441
Array Function: GROWTH 445
The Logs of Gamma 448
Sorting Data 449
For Mac Users 450
Appendix A: When Your Worksheet Is a Database 451
Introducing Excel Databases 451
The Satellites database 452
The criteria range 453
The format of a database function 454
Counting and Retrieving 455
DCOUNT and DCOUNTA 455
DGET 456
Arithmetic 457
DMAX and DMIN 457
DSUM 457
DPRODUCT 458
Statistics 458
DAVERAGE 458
DVAR and DVARP 458
DSTDEV and DSTDEVP 459
According to Form 459
Pivot Tables 461
Appendix B: The Analysis of Covariance 467
Covariance: A Closer Look 467
Why You Analyze Covariance 468
How You Analyze Covariance 469
ANCOVA in Excel 470
Method 1: ANOVA 471
Method 2: Regression 475
After the ANCOVA 478
And One More Thing 479
Index 481
Bonus Appendix 1: When Your Data Live Elsewhere
Bonus Appendix 2: Tips for Teachers (And Learners)
Introduction
What? Yet another statistics book? Well . . . this is a statistics book, all
right, but in my humble (and thoroughly biased) opinion, it’s not just
another statistics book.
What? Yet another Excel book? Same thoroughly biased opinion — it’s not
just another Excel book. What? Yet another edition of a book that’s not just
another statistics book and not just another Excel book? Well . . . yes. You got
me there.
So here’s the deal — for the previous two editions and for this one. Many
statistics books teach you the concepts but don’t give you a way to apply
them. That often leads to a lack of understanding. With Excel, you have a
ready-made package for applying statistics concepts.
Looking at it from the opposite direction, many Excel books show you Excel’s
capabilities but don’t tell you about the concepts behind them. Before I tell
you about an Excel statistical tool, I give you the statistical foundation it’s
based on. That way, you understand the tool when you use it — and you use
it more effectively.
I didn’t want to write a book that’s just “select this menu” and “click this
button.” Some of that is necessary, of course, in any book that shows you
how to use a software package. My goal was to go way beyond that.
I also didn’t want to write a statistics “cookbook”: When-faced-with-problem-
#310-use-statistical-procedure-#214. My goal was to go way beyond that, too.
Bottom line: This book isn’t just about statistics or just about Excel — it
sits firmly at the intersection of the two. In the course of telling you about
statistics, I cover every Excel statistical feature. (Well . . . almost. I left one
out. I left it out of the first two editions, too. It’s called “Fourier Analysis.” All
the necessary math to understand it would take a whole book, and you might
never use this tool, anyway.)
