Stephen L. Nelson is an author and CPA who provides accounting,
business advisory, tax planning, and tax preparation services to small
businesses. He is the author of more than 100 books, including
QuickBooks For Dummies and Quicken For Dummies.
Cover Image: ©iStockphoto.com/Henrik5000
Visit the companion website at www.dummies.com/extras/exceldataanalysis
to find sample spreadsheets from the examples used throughout the book.
Go to Dummies.com® for videos, step-by-step examples,
how-to articles, or to shop!
Open the book and find:
• How to make the most of Excel to analyze data
• Insight into the info you’re already working with
• No-sweat descriptions of how to get things done
• Guidance on creating PivotTables and PivotCharts
• Easy explanations of Excel add-ons
• Useful data analysis tips and facts
• A handy glossary of terms
• Fancier tools for those who have mastered the basics
$26.99 USA / $31.99 CAN / £17.99 UK
ISBN:978-1-118-89809-3
Computers/Desktop Applications/Spreadsheets
Want to analyze data?
Let Excel do the heavy lifting!
If you’re like most people, you probably don’t take full
advantage of Excel’s data analysis tools. This friendly guide
walks you through the features of Excel to help you discover
the insights in your rough data. From input, to analysis,
to visualization, this book shows you how to use Excel to
uncover what’s hidden within the numbers.
• The buck stops here: get straightforward guidance on how to put information into Excel workbooks so that you can begin to analyze it
• You oughta know: find must-know, useful tidbits about statistics, analyzing data, and visually presenting data that will make sense of the stuff you’re digging for
• Behold the power: discover the most powerful data analysis tools that Excel provides, namely its cross-tabulation capabilities, PivotTable and PivotChart
• You’re analyzing data, darling: get to know the more sophisticated tools provided by Excel data analysis add-ons, like t-test, z-test, scatter plot, regression, ANOVA, and Fourier
Excel® Data Analysis
2nd Edition
Stephen L. Nelson
Author of QuickBooks For Dummies®
E. C. Nelson
Learn to:
• Navigate and analyze data
• Work with external databases, PivotTables, and PivotCharts
• Use Excel for statistical and financial functions
• Make the most of the latest features of Excel 2013
Making Everything Easier!™
www.it-ebooks.info
Start with FREE Cheat Sheets
Cheat Sheets include
• Checklists
• Charts
• Common Instructions
• And Other Good Stuff!
Get Smart at Dummies.com
Dummies.com makes your life easier with 1,000s
of answers on everything from removing wallpaper
to using the latest version of Windows.
Check out our
• Videos
• Illustrated Articles
• Step-by-Step Instructions
Plus, each month you can win valuable prizes by entering
our Dummies.com sweepstakes. *
Want a weekly dose of Dummies? Sign up for Newsletters on
• Digital Photography
• Microsoft Windows & Office
• Personal Finance & Investing
• Health & Wellness
• Computing, iPods & Cell Phones
• eBay
• Internet
• Food, Home & Garden
Find out “HOW” at Dummies.com
*Sweepstakes not currently available in all countries; visit Dummies.com for official rules.
Get More and Do More at Dummies.com®
To access the Cheat Sheet created specifically for this book, go to
www.dummies.com/cheatsheet/exceldataanalysis
Excel® Data Analysis
2nd Edition
by Stephen L. Nelson, MBA, CPA
and E. C. Nelson
Excel® Data Analysis For Dummies®, 2nd Edition
Published by: John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, www.wiley.com
Copyright © 2014 by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by
any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted
under Sections 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the
Publisher. Requests to the Publisher for permission should be addressed to the Permissions Department,
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at
Trademarks: Wiley, For Dummies, the Dummies Man logo, Dummies.com, Making Everything Easier, and
related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and may not be
used without written permission. Excel is a registered trademark of Microsoft Corporation. All other
trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with
any product or vendor mentioned in this book.
LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND THE AUTHOR MAKE NO
REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS
OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING
WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY
MAY BE CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND
STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS
SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL,
ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. IF PROFESSIONAL ASSISTANCE IS REQUIRED,
THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT. NEITHER THE
PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT
THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR
A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE
PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR
RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET
WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS
WORK WAS WRITTEN AND WHEN IT IS READ.
For general information on our other products and services, please contact our Customer Care Department
within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993, or fax 317-572-4002. For technical support,
please visit www.wiley.com/techsupport.
Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material
included with standard print versions of this book may not be included in e-books or in print-on-demand. If
this book refers to media such as a CD or DVD that is not included in the version you purchased, you may
download this material at . For more information about Wiley products,
visit www.wiley.com.
Library of Congress Control Number: 2013957980
ISBN 978-1-118-89809-3 (pbk); ISBN 978-1-118-89808-6 (ebk); ISBN 978-1-118-89810-9 (ebk)
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1
Contents at a Glance
Introduction
Part I: Where’s the Beef?
Chapter 1: Introducing Excel Tables
Chapter 2: Grabbing Data from External Sources
Chapter 3: Scrub-a-Dub-Dub: Cleaning Data
Part II: PivotTables and PivotCharts
Chapter 4: Working with PivotTables
Chapter 5: Building PivotTable Formulas
Chapter 6: Working with PivotCharts
Chapter 7: Customizing PivotCharts
Part III: Advanced Tools
Chapter 8: Using the Database Functions
Chapter 9: Using the Statistics Functions
Chapter 10: Descriptive Statistics
Chapter 11: Inferential Statistics
Chapter 12: Optimization Modeling with Solver
Part IV: The Part of Tens
Chapter 13: Ten Things You Ought to Know about Statistics
Chapter 14: Almost Ten Tips for Presenting Table Results and Analyzing Data
Chapter 15: Ten Tips for Visually Analyzing and Presenting Data
Appendix: Glossary of Data Analysis and Excel Terms
Index
Table of Contents
Introduction
About This Book
What You Can Safely Ignore
What You Shouldn’t Ignore (Unless You’re a Masochist)
Foolish Assumptions
How This Book Is Organized
Part I: Where’s the Beef?
Part II: PivotTables and PivotCharts
Part III: Advanced Tools
Part IV: The Part of Tens
Icons Used in This Book
Beyond the Book
Where to Go from Here
Part I: Where’s the Beef?
Chapter 1: Introducing Excel Tables
What Is a Table and Why Do I Care?
Building Tables
Exporting from a database
Building a table the hard way
Building a table the semi-hard way
Analyzing Table Information
Simple statistics
Sorting table records
Using AutoFilter on a table
Undoing a filter
Turning off filter
Using the custom AutoFilter
Filtering a filtered table
Using advanced filtering
Chapter 2: Grabbing Data from External Sources
Getting Data the Export-Import Way
Exporting: The first step
Importing: The second step (if necessary)
Querying External Databases and Web Page Tables
Running a web query
Importing a database table
Querying an external database
It’s Sometimes a Raw Deal
Chapter 3: Scrub-a-Dub-Dub: Cleaning Data
Editing Your Imported Workbook
Delete unnecessary columns
Delete unnecessary rows
Resize columns
Resize rows
Erase unneeded cell contents
Format numeric values
Copying worksheet data
Moving worksheet data
Replacing data in fields
Cleaning Data with Text Functions
What’s the big deal, Steve?
The answer to some of your problems
The CLEAN function
The CONCATENATE function
The EXACT function
The FIND function
The FIXED function
The LEFT function
The LEN function
The LOWER function
The MID function
The PROPER function
The REPLACE function
The REPT function
The RIGHT function
The SEARCH function
The SUBSTITUTE function
The T function
The TEXT function
The TRIM function
The UPPER function
The VALUE function
Converting text function formulas to text
Using Validation to Keep Data Clean
Part II: PivotTables and PivotCharts
Chapter 4: Working with PivotTables
Looking at Data from Many Angles
Getting Ready to Pivot
Running the PivotTable Wizard
Fooling Around with Your Pivot Table
Pivoting and re-pivoting
Filtering pivot table data
Refreshing pivot table data
Sorting pivot table data
Pseudo-sorting
Grouping and ungrouping data items
Selecting this, selecting that
Where did that cell’s number come from?
Setting value field settings
Customizing How Pivot Tables Work and Look
Setting pivot table options
Formatting pivot table information
Chapter 5: Building PivotTable Formulas
Adding Another Standard Calculation
Creating Custom Calculations
Using Calculated Fields and Items
Adding a calculated field
Adding a calculated item
Removing calculated fields and items
Reviewing calculated field and calculated item formulas
Reviewing and changing solve order
Retrieving Data from a Pivot Table
Getting all the values in a pivot table
Getting a value from a pivot table
Arguments of the GETPIVOTDATA function
Chapter 6: Working with PivotCharts
Why Use a Pivot Chart?
Getting Ready to Pivot
Running the PivotTable Wizard
Fooling Around with Your Pivot Chart
Pivoting and re-pivoting
Filtering pivot chart data
Refreshing pivot chart data
Grouping and ungrouping data items
Using Chart Commands to Create Pivot Charts
Chapter 7: Customizing PivotCharts
Selecting a Chart Type
Working with Chart Styles
Changing Chart Layout
Chart and axis titles
Chart legend
Chart data labels
Chart data tables
Chart axes
Chart gridlines
Changing a Chart’s Location
Formatting the Plot Area
Formatting the Chart Area
Chart fill patterns
Chart area fonts
Formatting 3-D Charts
Formatting the walls of a 3-D chart
Using the 3-D View command
Part III: Advanced Tools
Chapter 8: Using the Database Functions
Quickly Reviewing Functions
Understanding function syntax rules
Entering a function manually
Entering a function with the Function command
Using the DAVERAGE Function
Using the DCOUNT and DCOUNTA Functions
Using the DGET Function
Using the DMAX and DMIN Functions
Using the DPRODUCT Function
Using the DSTDEV and DSTDEVP Functions
Using the DSUM Function
Using the DVAR and DVARP Functions
Chapter 9: Using the Statistics Functions
Counting Items in a Data Set
COUNT: Counting cells with values
COUNTA: Alternative counting cells with values
COUNTBLANK: Counting empty cells
COUNTIF: Counting cells that match criteria
PERMUT: Counting permutations
COMBIN: Counting combinations
Means, Modes, and Medians
AVEDEV: An average absolute deviation
AVERAGE: Average
AVERAGEA: An alternate average
TRIMMEAN: Trimming to a mean
MEDIAN: Median value
MODE: Mode value
GEOMEAN: Geometric mean
HARMEAN: Harmonic mean
Finding Values, Ranks, and Percentiles
MAX: Maximum value
MAXA: Alternate maximum value
MIN: Minimum value
MINA: Alternate minimum value
LARGE: Finding the kth largest value
SMALL: Finding the kth smallest value
RANK: Ranking an array value
PERCENTRANK: Finding a percentile ranking
PERCENTILE: Finding a percentile ranking
FREQUENCY: Frequency of values in a range
PROB: Probability of values
Standard Deviations and Variances
STDEV: Standard deviation of a sample
STDEVA: Alternate standard deviation of a sample
STDEVP: Standard deviation of a population
STDEVPA: Alternate standard deviation of a population
VAR: Variance of a sample
VARA: Alternate variance of a sample
VARP: Variance of a population
VARPA: Alternate variance of a population
COVARIANCE.P and COVARIANCE.S: Covariances
DEVSQ: Sum of the squared deviations
Normal Distributions
NORM.DIST: Probability X falls at or below a given value
NORM.INV: X that gives specified probability
NORM.S.DIST: Probability variable within z-standard deviations
NORM.S.INV: z-value equivalent to a probability
STANDARDIZE: z-value for a specified value
CONFIDENCE: Confidence interval for a population mean
KURT: Kurtosis
SKEW and SKEW.P: Skewness of a distribution
t-distributions
T.DIST: Left-tail Student t-distribution
T.DIST.RT: Right-tail Student t-distribution
T.DIST.2T: Two-tail Student t-distribution
T.INV: Left-tailed Inverse of Student t-distribution
T.INV.2T: Two-tailed Inverse of Student t-distribution
T.TEST: Probability two samples from same population
f-distributions
F.DIST: Left-tailed f-distribution probability
F.DIST.RT: Right-tailed f-distribution probability
F.INV: Left-tailed f-value given f-distribution probability
F.INV.RT: Right-tailed f-value given f-distribution probability
F.TEST: Probability data set variances not different
Binomial Distributions
BINOM.DIST: Binomial probability distribution
BINOM.INV: Binomial probability distribution
BINOM.DIST.RANGE: Binomial probability of Trial Result
NEGBINOM.DIST: Negative binomial distribution
CRITBINOM: Cumulative binomial distribution
HYPGEOM.DIST: Hypergeometric distribution
Chi-Square Distributions
CHISQ.DIST.RT: Chi-square distribution
CHISQ.DIST: Chi-square distribution
CHISQ.INV.RT: Right-tailed chi-square distribution probability
CHISQ.INV: Left-tailed chi-square distribution probability
CHISQ.TEST: Chi-square test
Regression Analysis
FORECAST: Forecast dependent variables using a best-fit line
INTERCEPT: y-axis intercept of a line
LINEST
SLOPE: Slope of a regression line
STEYX: Standard error
TREND
LOGEST: Exponential regression
GROWTH: Exponential growth
Correlation
CORREL: Correlation coefficient
PEARSON: Pearson correlation coefficient
RSQ: r-squared value for a Pearson correlation coefficient
FISHER
FISHERINV
Some Really Esoteric Probability Distributions
BETA.DIST: Cumulative beta probability density
BETA.INV: Inverse cumulative beta probability density
EXPON.DIST: Exponential probability distribution
GAMMA.DIST: Gamma distribution probability
GAMMAINV: X for a given gamma distribution probability
GAMMALN: Natural logarithm of a gamma distribution
LOGNORMDIST: Probability of lognormal distribution
LOGINV: Value associated with lognormal distribution probability
POISSON: Poisson distribution probabilities
WEIBULL: Weibull distribution
ZTEST: Probability of a z-test
Chapter 10: Descriptive Statistics
Using the Descriptive Statistics Tool
Creating a Histogram
Ranking by Percentile
Calculating Moving Averages
Exponential Smoothing
Generating Random Numbers
Sampling Data
Chapter 11: Inferential Statistics
Using the t-test Data Analysis Tool
Performing z-test Calculations
Creating a Scatter Plot
Using the Regression Data Analysis Tool
Using the Correlation Analysis Tool
Using the Covariance Analysis Tool
Using the ANOVA Data Analysis Tools
Creating an f-test Analysis
Using Fourier Analysis
Chapter 12: Optimization Modeling with Solver
Understanding Optimization Modeling
Optimizing your imaginary profits
Recognizing constraints
Setting Up a Solver Worksheet
Solving an Optimization Modeling Problem
Reviewing the Solver Reports
The Answer Report
The Sensitivity Report
The Limits Report
Some other notes about Solver reports
Working with the Solver Options
Using the All Methods options
Using the GRG Nonlinear tab
Using the Evolutionary tab
Saving and reusing model information
Understanding the Solver Error Messages
Solver has found a solution
Solver has converged to the current solution
Solver cannot improve the current solution
Stop chosen when maximum time limit was reached
Solver stopped at user’s request
Stop chosen when maximum iteration limit was reached
Objective Cell values do not converge
Solver could not find a feasible solution
Linearity conditions required by this LP Solver are not satisfied
The problem is too large for Solver to handle
Solver encountered an error value in a target or constraint cell
There is not enough memory available to solve the problem
Error in model. Please verify that all cells and constraints are valid
Part IV: The Part of Tens
Chapter 13: Ten Things You Ought to Know about Statistics
Descriptive Statistics Are Straightforward
Averages Aren’t So Simple Sometimes
Standard Deviations Describe Dispersion
An Observation Is an Observation
A Sample Is a Subset of Values
Inferential Statistics Are Cool but Complicated
Probability Distribution Functions Aren’t Always Confusing
Uniform distribution
Normal distribution
Parameters Aren’t So Complicated
Skewness and Kurtosis Describe a Probability Distribution’s Shape
Confidence Intervals Seem Complicated at First, but Are Useful
Chapter 14: Almost Ten Tips for Presenting Table Results and Analyzing Data
Work Hard to Import Data
Design Information Systems to Produce Rich Data
Don’t Forget about Third-Party Sources
Just Add It
Always Explore Descriptive Statistics
Watch for Trends
Slicing and Dicing: Cross-Tabulation
Chart It, Baby
Be Aware of Inferential Statistics
Chapter 15: Ten Tips for Visually Analyzing and Presenting Data
Using the Right Chart Type
Using Your Chart Message as the Chart Title
Beware of Pie Charts
Consider Using Pivot Charts for Small Data Sets
Avoiding 3-D Charts
Never Use 3-D Pie Charts
Be Aware of the Phantom Data Markers
Use Logarithmic Scaling
Don’t Forget to Experiment
Get Tufte
Appendix: Glossary of Data Analysis and Excel Terms
Index
Introduction
So here’s a funny deal: You know how to use Excel. You know how to
create simple workbooks and how to print stuff. And you can even, with
just a little bit of fiddling, create cool-looking charts.
But I bet that you sometimes wish that you could do more with Excel. You
sometimes wish, I wager, that you could use Excel to really gain insights into
the information, the data, that you work with in your job.
Using Excel for data analysis is what this book is all about. This book
assumes that you want to use Excel to learn new stuff, discover new secrets,
and gain new insights into the information that you’re already working with in
Excel—or the information stored electronically in some other format, such
as in your accounting system or from your web server’s analytics.
About This Book
This book isn’t meant to be read cover to cover like a Dan Brown page-turner.
Rather, it’s organized into tiny, no-sweat descriptions of how to do the things
that must be done. Hop around and read the chapters that interest you.
If you’re the sort of person who, perhaps because of a compulsive bent,
needs to read a book cover to cover, that’s fine. I recommend that you delve
into the chapters on inferential statistics, however, only if you’ve taken at
least a couple of college-level statistics classes. But that caveat aside, feel
free. After all, maybe Dancing with the Stars is a rerun tonight.
What You Can Safely Ignore
This book provides a lot of information. That’s the nature of a how-to
reference. So I want to tell you that it’s pretty darn safe for you to blow off some
chunks of the book.
For example, in many places throughout the book I provide step-by-step
descriptions of the task. When I do so, I always start each step with a
bold-faced description of what the step entails. Underneath that bold-faced
step description, I provide detailed information about what happens after
you perform that action. Sometimes I also offer help with the mechanics of
the step, like this:
1. Press Enter.
Find the key that’s labeled Enter. Extend your index finger so that it rests
ever so gently on the Enter key. Then, in one sure, fluid motion, press
the key by using your index finger. Then release the key.
Okay, that’s kind of an extreme example. I never actually go into that much
detail. My editor won’t let me. But you get the idea. If you know how to press
Enter, you can just do that and not read further. If you need help—say with
the finger-depression part or the finding-the-right-key part—you can read
the nitty-gritty details.
You can also skip the paragraphs flagged with the Technical Stuff icon. These
icons flag information that’s sort of tangential, sort of esoteric, or sort of
questionable in value...at least for the average reader. If you’re really
interested in digging into the meat of the subject being discussed, go ahead and
read ’em. If you’re really just trying to get through your work so that you can
get home and watch TV with your kids, skip ’em.
I might as well also say that you don’t have to read the information provided
in the paragraphs marked with a Tip icon, either. I assume that you want to
know an easier way to do something. But if you like to do things the hard way
because that improves your character and makes you tougher, go ahead and
skip the Tip icons.
What You Shouldn’t Ignore (Unless
You’re a Masochist)
By the way, don’t skip the Warning icons. They’re the text flagged with a
picture of a 19th-century bomb. They describe some things that you really
shouldn’t do.
Out of respect for you, I don’t put stuff in these paragraphs such as, “Don’t
smoke.” I figure that you’re an adult. You get to make your own lifestyle
decisions.
I reserve these warnings for more urgent and immediate dangers—things
that you can but shouldn’t do. For example: “Don’t smoke while filling your
car with gasoline.”
Foolish Assumptions
I assume just three things about you:
✓ You have a PC with a recent version of Microsoft Excel installed, such as Excel 2013.
✓ You know the basics of working with your PC and Microsoft Windows.
✓ You know the basics of working with Excel, including how to start and
stop Excel, how to save and open Excel workbooks, and how to enter
text and values and formulas into worksheet cells.
How This Book Is Organized
This book is organized into four parts:
Part I: Where’s the Beef?
In Part I, I discuss how you get data into Excel workbooks so that you can
begin to analyze it. This is important stuff, but fortunately most of it is pretty
straightforward. If you’re new to data analysis and not all that fluent yet in
working with Excel, you definitely want to begin in Part I.
Part II: PivotTables and PivotCharts
In the second part of this book, I cover what are perhaps the most powerful
data analysis tools that Excel provides: its cross-tabulation capabilities using
the PivotTable and PivotChart commands.
No kidding, I don’t think any Excel data analysis skill is more useful than
knowing how to create pivot tables and pivot charts. If I could, I would give
you some sort of guarantee that the time you spent reading how to use these
tools is always worth the investment you make. Unfortunately, after
consultation with my attorney, I find that this is impossible to do.
Part III: Advanced Tools
In Part III, I discuss some of the more sophisticated tools that Excel
supplies for doing data analysis. Some of these tools are always available in
Excel, such as the statistical functions. (I use a couple of chapters to cover
these.) Some of the tools come in the form of Excel add-ins, such as the Data
Analysis and the Solver add-ins.
I don’t think that these tools are going to be of interest to most readers of this
book. But if you already know how to do all the basic stuff and you have some
good training or experience in statistical and quantitative methods, you ought
to peruse these chapters. Some really useful whistles and bells are available to
advanced users of Excel. And it would be a shame if you didn’t at least know
what they are and the basic steps that you need to take to use them.
Part IV: The Part of Tens
In my mind, perhaps the most clever element that Dan Gookin, the author of
the original and first For Dummies book, DOS For Dummies, came up with is
the part with chapters that just list information in David Letterman-ish fashion.
These chapters let us authors list useful tidbits, tips, and factoids for you.
Excel Data Analysis For Dummies, Second Edition includes three such
chapters. In the first, I provide some basic facts most everybody should know
about statistics and statistical analysis. In the second, I suggest almost ten tips for
successfully and effectively analyzing data in Excel. Finally, in the third
chapter, I try to make some useful suggestions about how you can visually analyze
information and visually present data analysis results.
The Part of Tens chapters aren’t technical. They aren’t complicated. They’re
very basic. You should be able to skim the information provided in these
chapters and come away with at least a few nuggets of useful information.
The appendix contains a handy glossary of terms you should understand
when working with data in general and Excel specifically. From kurtosis to
histograms, these sometimes baffling terms are defined here.
Icons Used in This Book
Like other For Dummies books, this book uses icons, or little margin pictures,
to flag things that don’t quite fit into the flow of the chapter discussion. Here
are the icons that I use:
Technical Stuff: This icon points out some dirty technical details that you
might want to skip.
Tip: This icon points out a shortcut to make your life easier or more fulfilling.
Remember: This icon points out things that you should, well, remember.
Warning: This icon is a friendly but forceful reminder not to do
something...or else.
Excel 2007/2010: This icon indicates specialized instructions you should pay
attention to if you’re using one of those versions of Excel.
Beyond the Book
✓ Cheat Sheet: This book’s Cheat Sheet can be found online at
www.dummies.com/cheatsheet/exceldataanalysis. See the Cheat
Sheet for info on Excel database functions, Boolean expressions, and
important statistical terms.
✓ Dummies.com online articles: Companion articles to this book’s
content can be found online at
www.dummies.com/extras/exceldataanalysis. The topics range from tips on pivot tables and
timelines to how to buff your Excel formula-building skills.
✓ Downloadable example workbooks: You can download the example
workbooks I use in this book at
www.dummies.com/extras/exceldataanalysis.
✓ Updates: If this book has any updates after printing, they will be posted
to www.dummies.com/extras/exceldataanalysis.
Where to Go from Here
If you’re just getting started with Excel data analysis, flip the page and start
reading the first chapter.
If you have a bit of skill with Excel or you have a special problem or question,
use the Table of Contents or the index to find out where I cover a topic and
then turn to that page.
Good luck! Have fun!
Part I
Where’s the Beef?
Visit www.dummies.com for more great content online.
In this part ...
✓ Understand how to build Excel tables that hold and store the
data you need to analyze.
✓ Find quick and easy ways to begin your analysis using simple
statistics, sorting, and filtering.
✓ Get practical stratagems and commonsense tactics for grabbing
data from external sources.
✓ Discover tools for cleaning and organizing the raw data you
want to analyze.
Chapter 1
Introducing Excel Tables
In This Chapter
▶ Figuring out tables
▶ Building tables
▶ Analyzing tables with simple statistics
▶ Sorting tables
▶ Discovering the difference between using AutoFilter and filtering
First things first. I need to start my discussion of using Excel for data
analysis by introducing Excel tables, or what Excel used to call lists.
Why? Because, except in the simplest of situations, when you want to analyze
data with Excel, you want that data stored in a table. In this chapter, I discuss
what defines an Excel table; how to build, analyze, and sort a table; and why
using filters to create a subtable is useful.
What Is a Table and Why Do I Care?
A table is, well, a list. This definition sounds simplistic, I guess. But take a
look at the simple table shown in Figure 1-1. This table shows the items that
you might shop for at a grocery store on the way home from work.
As I mention in the Introduction of this book, many of the Excel workbooks that
you see in the figures of this book are available for download from this book’s
companion website. For more on how to access the companion website, see
the Introduction.
Commonly, tables include more information than Figure 1-1 shows. For example,
take a look at the table shown in Figure 1-2. In column A, the table
names the store where you might purchase the item. In column C, this expanded
table gives the quantity of some item that you need. In column D, this table
provides a rough estimate of the price.
Figure 1-1: A table: Start out with the basics.
Figure 1-2: A grocery list for the more serious shopper...like me.
An Excel table usually looks more like the list shown in Figure 1-2. Typically,
the table enumerates rather detailed descriptions of numerous items. But a
table in Excel, after you strip away all the details, essentially resembles the
expanded grocery-shopping list shown in Figure 1-2.
Let me make a handful of observations about the table shown in Figure 1-2.
First, each column shows a particular sort of information. In the parlance of
database design, each column represents a field. Each field stores the same
sort of information. Column A, for example, shows the store where some item
can be purchased. (You might also say that this is the Store field.) Each piece
of information shown in column A—the Store field—names a store: Sams
Grocery, Hughes Dairy, and Butchermans.
The first row in the Excel worksheet provides field names. For example, in
Figure 1-2, row 1 names the four fields that make up the list: Store, Item,
Quantity, and Price. You always use the first row, called the header row, of an
Excel list to name, or identify, the fields in the list.
Starting in row 2, each row represents a record, or item, in the table. A record
is a collection of related fields. For example, the record in row 2 in Figure 1-2
shows that at Sams Grocery, you plan to buy two loaves of bread for a price
of $1 each. (Bear with me if these sample prices are wildly off; I usually don’t
do the shopping in my household.)
Row 3 shows or describes another item, coffee, also at Sams Grocery, for $8.
In the same way, the other rows of the super-sized grocery list show items
that you will buy. For each item, the table identifies the store, the item, the
quantity, and the price.
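If you think better in code than in worksheets, the header-row, field, and record ideas can be sketched in a few lines of Python. This is only an illustration, not anything Excel does internally: the SHEET text and the read_table helper are hypothetical names, the bread and coffee rows follow the example above, and the coffee quantity of 1 is an assumption because the text doesn't state it.

```python
import csv
import io

# Hypothetical worksheet contents written as CSV text. The first line plays
# the role of the header row; the coffee quantity of 1 is an assumption.
SHEET = """Store,Item,Quantity,Price
Sams Grocery,Bread,2,1
Sams Grocery,Coffee,1,8
"""

def read_table(text):
    """Return (field_names, records) for a table whose first row names the fields."""
    reader = csv.DictReader(io.StringIO(text))
    records = []
    for row in reader:
        # Each row after the header is one record: a collection of related fields.
        records.append({
            "Store": row["Store"],
            "Item": row["Item"],
            "Quantity": int(row["Quantity"]),
            "Price": float(row["Price"]),
        })
    return reader.fieldnames, records

field_names, records = read_table(SHEET)
print(field_names)
for record in records:
    print(record)
```

Every record carries the same four fields, and the header row is what gives those fields their names. That sameness is what lets you sort, filter, and summarize a table.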
Something to understand about Excel tables
An Excel table is a flat-file database. That flat-
file-ish-ness means that there’s only one table
in the database. And the flat-file-ish-ness also
means that each record stores every bit of
information about an item.
In comparison, popular desktop database
applications such as Microsoft Access are relational
databases. A relational database stores
information more efficiently. And the most striking
way in which this efficiency appears is that you
don’t see lots of duplicated or redundant
information in a relational database. In a relational
database, for example, you might not see Sams
Grocery appearing in cells A2, A3, A4, and A5. A
relational database might eliminate this
redundancy by having a separate table of grocery
stores.
This point might seem a bit esoteric; however,
you might find it handy when you want to grab
data from a relational database (where the
information is efficiently stored in separate
tables) and then combine all this data into a
super-sized flat-file database in the form of an
Excel list. In Chapter 2, I discuss how to grab
data from external databases.
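To make the flat-file versus relational distinction concrete, here is a minimal Python sketch. It assumes a made-up relational layout with a separate stores table, then combines the two tables into one flat list of the sort you would paste into Excel. The keys, items, quantities, and the flatten helper are all hypothetical; a real database would do this combining with a query.

```python
# Relational layout: stores live in their own table, and each shopping-list
# row points at a store by its key instead of repeating the store name.
# All keys, items, and quantities here are hypothetical.
stores = {1: "Sams Grocery", 2: "Hughes Dairy"}
shopping_list = [
    {"store_id": 1, "item": "Bread", "quantity": 2},
    {"store_id": 1, "item": "Coffee", "quantity": 1},
    {"store_id": 2, "item": "Milk", "quantity": 1},
]

def flatten(stores, rows):
    """Join the two tables into one flat list of self-contained records."""
    flat = []
    for row in rows:
        flat.append({
            "Store": stores[row["store_id"]],  # look up the name by its key
            "Item": row["item"],
            "Quantity": row["quantity"],
        })
    return flat

flat_table = flatten(stores, shopping_list)
for record in flat_table:
    print(record)
```

In the flattened result, the store name is repeated on every record that uses it. That repetition is the redundancy a relational database avoids and a flat-file Excel table accepts.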